I don't want a kit, I don't want to compile ... I want something where I can install some software and talk to my word processor.
What is available for Linux?
I'd like to know this too.
I know there's Orca, but I haven't quite figured out what it does or how to use it yet.
I can't answer the first question, but Orca definitely is not directly involved in recognizing speech. Orca processes what's already on the screen and reads it aloud to visually-impaired folk. Just wanted to clear that up.
The only voice-to-text project I know of is in Google Summer of Code this year, which is aiming to enable dictated voice notes in Tomboy, the sticky note application:
http://marcondes.wordpress.com/2008/...anbul-edition/
If that ultimately gets finished, then I would imagine that someone could develop a similar plugin for OpenOffice.org. But I think that would probably take a lot of time to happen.
You're not the only person interested in this, though. There's even a Speech Recognition page in the Ubuntu wiki: https://wiki.ubuntu.com/SpeechRecognition
Otherwise, I've only heard of people using Dragon NaturallySpeaking under Wine.
http://appdb.winehq.org/objectManage...rsion&iId=5402
http://ubuntuforums.org/showthread.php?t=168711
In my experience, voice recognition systems are dreadful. The technology isn't ready yet. Some years ago I saw a demo of Dragon NaturallySpeaking (I think) on News at Ten on ITV in Britain, and it was painful. Any secretary using it would need time off for stress! It....was....very....slow....you....had....to....speak....each....word....with a pause in between for the system to work. I find the voice recognition systems used by cinemas appalling as well. I call them the "Do you mean Birmingham?" systems. I live in Southampton; when it says "Say the town you want film information for", I say "Southampton" and it says "Do you mean Birmingham?". Totally useless! Also, BBC Watchdog (a consumer programme on BBC Television) featured the "Brain Training" game for the Nintendo DS. The voice recognition can only recognize American voices! There were a lot of complaints about it.
If these things are not sorted out pretty quickly, no one will use voice recognition at all, and it will go the same way as head-mounted displays for virtual reality: nowhere, with the technology disappearing without a trace.
Also, make sure you have a headset when using voice recognition, because the quality of the microphone is crucial! This is one reason why I think the phone versions of this technology are pathetic: you have a standard input with standard microphones, so you should be able to get a system working far more easily when you know what your input will be like, rather than having people speaking at varying distances from different microphones in different environments.
The only problem with phones is that you still have different people talking at different distances from the microphone in different environments, so they suffer from the same problem as any PC-based system...
I also live in Southampton, but Odeon always thinks I say Northampton - it's like it skips a whole syllable each time!
It is definitely being worked on.
From Gnome 2.24 release notes:
2.24 is the Gnome for Intrepid Ibex. It is still in a testing phase, so use only on a spare computer.
3.3. Better Screenreading
GNOME and its partners have worked hard to improve accessibility and screenreading support for both GNOME 2.24 and many popular third party applications.
Text-to-speech and braille device support is now vastly improved for Java applications, OpenOffice.org, Mozilla Thunderbird, Pidgin, GNOME's Help Browser and the GNOME Panel. Users are now made aware of unfocused dialogues when switching to an application.
There has also been a lot of work to integrate GNOME's screen reading technology with ARIA-enabled web browsers, starting with Mozilla Firefox.
Also new is automatic selection of the synthesised voice based on the system language, support for verbalised links, echo by sentence and optional tutorial messages.
Screen Reading is Text to Speech.
What the original poster was asking about is Speech to Text.
(Roughly 1% of the general population needs Speech to Text hardware and software to use a computer.)
To answer the original poster's question, try Sphinx-4.
(Synaptic has Sphinx-2, which is an older version.)
Sphinx-4 can be obtained from http://cmusphinx.sourceforge.net/sphinx4/
Note: Read the description of the differences between Sphinx-2 and Sphinx-4 before deciding which to install. (There is at least one more version, but I've forgotten its designation.)
xan
jonathon
Hi All,
New to the thread, but I do know of a speech-to-text development project based around Tcl. It is by no means install-and-go software, but it has potential, and I am providing it for completeness.
http://www.icsi.berkeley.edu/~dpwe/p...prachcore.html
http://www.inf.ed.ac.uk/resources/nlp/
As a side note, I am implementing the MS SAPI engine in Ubuntu; see my SAPI post, which includes a speech-to-text engine. I am working on text to speech, but this could also be exploited using a wrapper to SAPI.
Tom
Last edited by notlistening; October 2nd, 2008 at 02:10 AM.
Great wording for my expectations, too, although in my case it is more like feeding an audio file (recorded while I juggle) to an application that outputs a draft text, which I then read through, correct where necessary and hand over for further processing. Does anybody use something like that?
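For what it's worth, the batch workflow described here (recorded audio in, rough draft out, manual correction afterwards) could be sketched roughly as below. This is only an illustration using Python's standard `wave` module, and it assumes the recording is an uncompressed WAV file. The names `split_wav`, `draft_transcript` and `placeholder_recognizer` are made up for this example, and the recognizer is a stand-in: no real speech engine is called, so an actual engine such as Sphinx would have to be plugged in where the placeholder is.

```python
import wave
from typing import Callable, List


def split_wav(path: str, seconds_per_chunk: float = 10.0) -> List[bytes]:
    """Read a WAV file and return its raw PCM audio as fixed-length chunks."""
    chunks: List[bytes] = []
    with wave.open(path, "rb") as wav:
        # Number of audio frames that make up one chunk (at least one frame).
        frames_per_chunk = max(1, int(wav.getframerate() * seconds_per_chunk))
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:  # empty bytes means end of file
                break
            chunks.append(frames)
    return chunks


def draft_transcript(
    path: str,
    recognize: Callable[[bytes], str],
    seconds_per_chunk: float = 10.0,
) -> str:
    """Run a recognizer over each chunk and join the results into a draft.

    `recognize` is whatever speech-to-text engine you have available,
    wrapped as a function from raw PCM bytes to text.
    """
    pieces = [recognize(chunk) for chunk in split_wav(path, seconds_per_chunk)]
    # One line per chunk, skipping chunks where nothing was recognized.
    return "\n".join(piece for piece in pieces if piece)


def placeholder_recognizer(pcm: bytes) -> str:
    """HYPOTHETICAL stand-in: a real engine would return recognized words."""
    return f"[unrecognized audio, {len(pcm)} bytes]"
```

Calling `draft_transcript("juggling.wav", placeholder_recognizer)` (the file name is only an example) would give one placeholder line per ten seconds of audio. Splitting into chunks keeps each recognition call short, and one line per chunk makes it easier to find the matching spot in the recording when correcting the draft by hand.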