Wednesday, 29 September 2010

Monolog++

simon 0.3.0 was released about two weeks ago, which means that, even though the temperatures outside disagree, it's once again summer in trunk!

Our newest addition is actually one that has been in the works for quite some time (in a separate branch) and presents itself to the user as a new plugin: The Dialog plugin.

While simon 0.3.0 is ideally suited to silently execute the commands you tell him, the next version of simon will talk back.

The basic idea is quite simple: the dialog system lets the user define an arbitrary number of states, each of which has transitions that lead to other states of the dialog. Each transition can also execute other simon commands if configured to do so.

An example use case could look like this:
Every day at 10 am the system displays a dialog that asks the user if he has already taken his medication. Yes -> "Great!"; No -> "Do you need help?" -> etc.
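To make the idea of states and transitions a bit more concrete, here is a minimal Python sketch of the medication reminder above. This is not simon's actual code or configuration format: the class names, the state names ("ask", "great", "help") and the command name "notify_caregiver" are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    trigger: str   # what the user has to say to take this transition
    target: str    # name of the state the dialog moves to
    # Hypothetical names of commands to run when the transition fires.
    commands: list = field(default_factory=list)


@dataclass
class State:
    text: str      # what the system says (or displays) in this state
    transitions: list = field(default_factory=list)


class Dialog:
    def __init__(self, states, start):
        self.states = states    # maps state name -> State
        self.current = start    # name of the active state
        self.executed = []      # commands "run" so far (only recorded here)

    def prompt(self):
        return self.states[self.current].text

    def say(self, utterance):
        """Follow the transition matching the utterance, if there is one."""
        for transition in self.states[self.current].transitions:
            if transition.trigger.lower() == utterance.lower():
                self.executed.extend(transition.commands)
                self.current = transition.target
                break
        # An unknown utterance leaves the dialog in the current state.
        return self.prompt()


def build_medication_dialog():
    states = {
        "ask": State(
            "Have you already taken your medication?",
            [
                Transition("Yes", "great"),
                Transition("No", "help", commands=["notify_caregiver"]),
            ],
        ),
        "great": State("Great!"),
        "help": State("Do you need help?"),
    }
    return Dialog(states, start="ask")


if __name__ == "__main__":
    dialog = build_medication_dialog()
    print(dialog.prompt())
    print(dialog.say("No"))
```

Answering "No" moves the dialog to the "help" state and records the made-up command; anything the current state has no transition for simply repeats the current prompt.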

But you could also create quite complex menus like this:
You: "Computer!"
Computer: "Hi! What do you want to do? Say any of the following options: Read e-Mails; Browse the web; Check calendar; Close"
You: "Check calendar"
Computer: "Alright. These are your upcoming events: Birthday at Susie's Place in Graz"
You: "Where am I?"
Computer: "You are in Graz, Austria."
You: "Thanks"

Sounds ridiculous, right? Well, let's look at this example in a little bit more detail...

States, Options and Transitions
The dialog above is quite a simple state-based dialog. You can see three states: "Welcome", "Calendar" and "Location". We can define them in the dialog configuration.



You can see that dialog options to continue to other states can be added there as well. The text of the state actually goes through a templating engine, so you can define parameters for that in the "Template options" page.

But more interesting is probably the "Bound values" page. There you can define variables and bind them to values. Those values can be static, determined at runtime through QtScript (JavaScript), or taken from Plasma data engines.

For example, you can bind $currentTime$ to the Local/Time source of the date and time Plasma data engine. And because there are already lots of great Plasma data engines, this means that the dialog system is already quite powerful.
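As a rough illustration of what binding values means, here is a small Python sketch that fills $name$ placeholders in a state's text. It is only a stand-in under assumptions: simon's real templating engine, its QtScript bindings and the Plasma data engine API are not used here, and the function and variable names are invented. A static value is a plain string, and a value determined at runtime is modeled as a function that is called when the text is rendered.

```python
import re
from datetime import datetime

# Matches placeholders of the form $name$.
PLACEHOLDER = re.compile(r"\$(\w+)\$")


def render(template, bound_values):
    """Replace every $name$ in the template with its bound value."""

    def resolve(match):
        name = match.group(1)
        if name not in bound_values:
            # Leave unknown placeholders untouched so they are easy to spot.
            return match.group(0)
        value = bound_values[name]
        # Callables stand in for values that are looked up at runtime,
        # for example a script or a data engine query.
        return str(value() if callable(value) else value)

    return PLACEHOLDER.sub(resolve, template)


if __name__ == "__main__":
    bound_values = {
        "city": "Graz",  # static value
        "currentTime": lambda: datetime.now().strftime("%H:%M"),  # runtime value
    }
    print(render("You are in $city$. It is $currentTime$.", bound_values))
```

Because the runtime value is only evaluated inside render(), the same state text produces an up-to-date answer every time the dialog reaches that state.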



TTS
Remember that I said that simon will talk back to you in simon 0.4? Yes, we also have an all-new TTS layer, but I'll cover this in a separate blog post as this one is already too long :)



Demo
So, to show off the current state of development, I created a very short demo video showing the dialog above. While the code is not production ready, I didn't cheat: both the upcoming events and the location are determined dynamically, at runtime, through Plasma data engines (upcoming events use the calendar data engine to get data from your Akonadi calendar).



For RSS readers, here is a direct link.

Sunday, 19 September 2010

simon at the AAL Forum 2010: Again

The whole simon listens team attended this year's AAL Forum in Odense, Denmark.


The AAL Forum is a platform for projects of the Ambient Assisted Living Joint Program of the European Union and related projects. After having attended Akademy this year, it was quite interesting to see the other side of software development, with almost all the projects there being quite well funded :).

Despite the very steep attendance fee (€ 450) the exhibition was quite active. In just three days we collected more than 30 business cards of interested people - many of them looking for project partners for their next projects.

All in all, it was very interesting to see related projects and to discover similarities and potential synergies. Cooperation across projects - even within the same call - is still far too rare in my opinion.

Quite a few people were surprised to find out that simon is open source and completely free ("Where are the hidden costs?"), so we also got to introduce some people to the concept of free software.

On Thursday we then got to see DJ Ruth Flowers at the networking dinner. And it was just awesome seeing so many suits dance :P.


I really wouldn't have thought that such a formal event could be turned into a wild party with just a good DJ. Suffice it to say: the booths were quite empty the following day.

We then spent the last evening before flying home in Copenhagen, which, as it turns out, is a great city - and they have great cocktail bars as well :)

Monday, 13 September 2010

Application centric speech recognition for your desktop: simon 0.3.0 released

The new version 0.3.0 of the open source speech recognition system simon has been released and boasts the all-new scenario system, which allows you to build your own customized speech recognition system with just a few mouse clicks.

With simon you can control your computer with your voice. You can open programs, URLs, type configurable text snippets, simulate shortcuts, control the mouse and keyboard and much more.

Because of simon's architecture, it is not bound to a specific language and can be used with any dialect. It is also specifically designed to handle speech impairments, which makes simon a viable alternative to conventional input methods, especially for physically disabled people and senior citizens.

simon is based on the open source large vocabulary continuous speech recognition engine Julius.

New in simon 0.3

simon 0.3 introduces an application centric approach to speech recognition by using packaged use cases of the speech recognition called "scenarios". Scenarios contain the complete configuration for one specific task like controlling Firefox or using the voice controlled on screen keyboard. These scenarios can then be shared with other simon users and are collected in a central online repository which can be accessed directly from within the application.

Besides the scenario system, the new version lets the user not only create his own model through training but also use an existing acoustic model (base model) to get started even quicker - entirely without training. If the user wants more control or would like to improve recognition accuracy, personalized training is possible through the optional HTK (not included in simon due to license restrictions). simon then offers to adapt the base model in use to your own voice or to create a new model entirely from scratch.

Additionally, we have been working hard to make simon even easier to use. Some of the more notable results of these efforts are the new introductory wizard that guides you through the initial setup as well as the speech model generation adapter that automatically fixes a wide variety of common beginners' mistakes for you.

Furthermore, simon 0.3 introduces three new applications to the suite. Sam, an acoustic modeling tool, is geared towards professionals who want to tinker with their speech model and get the best recognition out of it. It is also a great tool to create and test large models which can then be distributed as base models for other simon users. To create base models you also need a lot of speech data, which can be easily collected through the newly introduced combo of ssc and sscd. ssc stands for simon sample collector and is the client to the sscd server. Together they provide a powerful, cross-platform tool to collect samples from lots of different speakers - even allowing you to record with multiple microphones and / or sound cards simultaneously.

Demonstration




Readers of the RSS feed: Watch it on YouTube

Download

You can download simon 0.3 as a source archive, but there are also packages available for Windows, openSUSE and Ubuntu on our SourceForge page. Up-to-date installation instructions are available on the simon listens wiki.

Sunday, 12 September 2010

simon at the AAL Forum 2010

I'm happy to announce that the simon listens team will attend this year's Ambient Assisted Living Forum in Odense, Denmark!

The AAL Forum (15th-17th September) is an annual event held as part of the Ambient Assisted Living Joint Program of the European Union. The main goal of this program is to improve everyday life for healthy seniors.

While this might not sound like the most exciting topic at first, it is a fast-moving, exciting field of research that covers everything from home automation to assistive robotics.

We will be represented by a booth in the exhibition hall, and Franz Stieger, our chairman, will give both a short introductory talk on simon listens and another one about the project in the context of robotics-enabled assisted living.

It isn't all work and no play, though. In the spirit of the conference, an internationally acclaimed 69-year-old DJ called Ruth Flowers will apparently rock the social event of the conference. I can't wait to see that :)