Simon: Open-Source Speech Recognition: December 2012

After years of hard work, the Simon team is proud to announce the new major release: Simon 0.4.0.

New in Simon 0.4

This new version of the open source speech recognition system Simon features a whole new recognition layer, context-awareness for improved accuracy and performance, a dialog system able to hold whole conversations with the user and more.

Revisiting Usability

A lot of work has gone into making Simon easier to use - both for existing and new users.

Perhaps most visibly, the main window of Simon has been reorganized to bring the most important options together in one screen.

Simon 0.4.0: Main window

Moreover, the newly introduced Simon base model format (.sbm) and the integration of a GHNS (Get Hot New Stuff) online repository of base models have removed the last big hurdle of the initial configuration.
One can now easily go from a fresh installation to a working setup in less than 5 minutes without any preparation. Don't believe me? Check out the quick start below!

Simon 0.4.0: Quick Start

Many other, smaller changes add up to one simple but important difference: overall, Simon requires less user interaction while achieving more.

SPHINX

One of the major internal changes in Simon 0.4 is, of course, the new support for the BSD-licensed CMU SPHINX. While we still maintain full support for HTK and Julius, new models compiled with Simon default to the SPHINX backend, and the (proprietary) HTK is no longer required to build user-generated models.
Best of all: Simon will select the correct backend for your configuration transparently and automatically.

Voxforge

A major problem of open source speech recognition has always been the lack of freely available high quality speech models.

The Voxforge project has been working for years towards GPL acoustic models for a variety of languages. While their models are certainly not yet perfect, they offer a promising starting point.
The English Voxforge model is of course available as a Simon base model and can be downloaded and imported with Simon.

Additionally, starting with Simon 0.4, users will also have the option to contribute their gathered Simon training samples directly to the Voxforge server.
These recordings will then be used to train and improve the general acoustic models.

Simon 0.4.0: Training

By the way: Behind the scenes this upload is based on SSC.

Context

There is a simple rule of thumb in speech recognition: The smaller the application domain, the better the recognition accuracy. This was always one of the core principles of Simon.

In Simon 0.4, however, we went one step further: Simon can now reconfigure itself on the fly as the current situation changes. Through so-called "context conditions", Simon 0.4 can automatically activate and deactivate selected scenarios, microphones and even parts of your training corpus.

For example: Why listen for "Close tab" when your browser isn't even open? Or why listen for anything at all when you're actually in the next room listening to music? Yes, Simon is watching you.
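
To make the idea concrete, here is a minimal sketch of how condition-gated scenarios could work. It is not Simon's actual implementation: the `Scenario` class, the `process_running` and `user_present` conditions and the dictionary-based context are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical snapshot of the current situation (not Simon's data model).
Context = Dict[str, object]
Condition = Callable[[Context], bool]


@dataclass
class Scenario:
    name: str
    commands: List[str]
    # Assumption: a scenario is active only if all of its conditions hold.
    conditions: List[Condition] = field(default_factory=list)

    def is_active(self, ctx: Context) -> bool:
        return all(cond(ctx) for cond in self.conditions)


def process_running(process: str) -> Condition:
    """Condition that holds while the given process is running."""
    return lambda ctx: process in ctx.get("processes", set())


def user_present() -> Condition:
    """Condition that holds while the user is near the microphone."""
    return lambda ctx: bool(ctx.get("user_present", False))


def active_commands(scenarios: List[Scenario], ctx: Context) -> List[str]:
    """Collect the commands of every scenario whose conditions are met."""
    commands: List[str] = []
    for scenario in scenarios:
        if scenario.is_active(ctx):
            commands.extend(scenario.commands)
    return commands


scenarios = [
    Scenario("Browser", ["Close tab", "New tab"],
             [user_present(), process_running("firefox")]),
    Scenario("Desktop", ["Open browser"], [user_present()]),
]

# Browser open, user present: both scenarios are active.
print(active_commands(scenarios, {"user_present": True, "processes": {"firefox"}}))
# Browser closed: "Close tab" is no longer part of the vocabulary.
print(active_commands(scenarios, {"user_present": True, "processes": set()}))
# User in the next room: nothing is listened for at all.
print(active_commands(scenarios, {"user_present": False, "processes": {"firefox"}}))
```

The smaller the set of active commands, the smaller the application domain the recognizer has to cover at any given moment, which is where the accuracy gain comes from.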

Simon 0.4.0: Context awareness

Dialog System

Simon 0.4.0 also ships with the new dialog system featuring scripted variables (JavaScript), integration with Plasma data engines, a templating system and - of course - text-to-speech output.
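
As a rough illustration of how scripted variables and a templating system fit together, consider the sketch below. It is only an analogy: Simon scripts its variables in JavaScript and has its own template syntax, whereas this example uses Python's `string.Template` with its `$name` placeholders. The `render_prompt` helper and the variable names are hypothetical.

```python
from string import Template
from typing import Callable, Dict


def render_prompt(template: str, variables: Dict[str, Callable[[], str]]) -> str:
    """Evaluate every scripted variable, then fill the values into the template."""
    values = {name: compute() for name, compute in variables.items()}
    return Template(template).substitute(values)


# Hypothetical scripted variables: each one is computed when the prompt is rendered.
variables = {
    "user": lambda: "Anna",
    "pending": lambda: str(len(["mail", "call"])),
}

prompt = render_prompt("Hello $user, you have $pending open tasks.", variables)
print(prompt)  # Hello Anna, you have 2 open tasks.
```

In a dialog system, the rendered text would then be passed on to the text-to-speech output.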

Simonoid

For users of KDE's Plasma workspace, we now provide the "Simonoid" plasmoid to start and monitor Simon - including the current recording volume.

Simonoid

The screenshot above shows two instances of the plasmoid: One added to the panel and another one to the desktop.

... and everything else

Please don't be fooled into thinking that the above is a complete list of all improvements. For example, we also have a new sample review tool called Afaras, integration with the Sequitur grapheme-to-phoneme framework, an Akonadi command plugin and many, many other noteworthy changes.
You'll have to try out Simon to see for yourself!

Download

To install Simon 0.4.0, you can either compile the official source tarball, install a binary package provided by your Linux distribution or use the installer for Windows.

Microsoft Windows: Installer

Source Code

If you are a packager and would like to package Simon 0.4, please do get in touch with us. Thank you.

Simon 0.4.0: RC1

As of right now, the first release candidate of Simon 0.4.0 is available:

As can be expected, a lot of bugs have been fixed since the last beta.

However, that's not all that changed: This release candidate also comes with complete handbooks for all Simon applications. Besides documenting the many new features, I also completely restructured the Simon handbook to hopefully provide a better starting point for new users.

Moreover, the Windows installer has been vastly improved and now actually ships a fully fledged Simon version with Julius and SPHINX support. This means that you can use any Simon base model right from the start and even build your own speech model from scratch without installing any additional software.

This brings me neatly to the call for packagers: If you want to help package Simon, please get in touch with me.

Sunday, 30 December 2012

Simon 0.4.0

Thursday, 20 December 2012

Simon 0.4.0: RC1

Sunday, 9 December 2012

Simon 0.4: Beta 2

Wednesday, 5 December 2012

Simon 0.4: Beta 2 Update