Text to Speech


Robotman
June 2nd, 2007, 10:34 AM
Howdy,
I just installed Ubuntu 7.04 and I really like it, especially its speed compared to XP. Unfortunately, I can't seem to get any text to speech software to work yet. If I could get a speech synthesiser to read text for me like I can with Windows (using "TextAloud MP3"), I could abandon that other OS altogether, as 'reading' news is what I do most with this PC.
Is there any good text-to-speech software that will run in Ubuntu? I've read about and downloaded something called "Festival" but I don't know what to do with all those files and I've read that it doesn't have a GUI. Any tips to have me listening to my text would be very welcome. ;)

frafu
June 2nd, 2007, 03:21 PM
Ubuntu Feisty ships with a screenreader named Orca.

Have a look at the menu System->Preferences->Accessibility->Assistive Technology Preferences.

That might help you.

Francesco

Robotman
June 2nd, 2007, 04:40 PM
Thanks. That's a great start... but ideally I want something that only reads what I tell it to read, not every window that comes to the foreground.

joann
June 9th, 2007, 02:07 PM
I am visually impaired and often find it nice to have software read for me as an addition to my screen magnification software. So I do understand the desire to have software that will read what I tell it to read.

There are several programs that may help you :)

Here is a small list of useful programs and add-ons. The links have more details about the programs and how to install them...

Emacspeak- this is a talking extension to the emacs editor
<http://emacspeak.sourceforge.net/>
This can be installed easily using the Synaptic package manager or 'apt-get install emacspeak', but make sure that all of the dependencies are satisfied before installing emacspeak, otherwise it will not function properly.
KTTS- KDE text to speech system
This works well with KDE, and allows items on the clipboard to be spoken.
<http://accessibility.kde.org/developer/kttsd/>
I think this should be installed by default, maybe. But if it is not, you can install it using the Synaptic package manager. The website has more information.

Read Out Loud- This is for the Adobe PDF reader; it allows PDFs to be spoken. Very nice when you just want to listen to a PDF document.
This link has information on how to set up Read Out Loud
<http://help.adobe.com/en_US/Acrobat/8.0/Standard/help.html?content=WS58a04a822e3e50102bd615109794195ff-7d15.html>
And this link has the software ...
<http://www.adobe.com/products/reader/>

ClickSpeak- This is an extension for Firefox. It will speak web pages for you, and highlight the text while it reads. (I really like this a lot! It can also be used with or without Orca for speech, which is really nifty)
<http://clickspeak.clcworld.net/>
Click on installation guide, and follow the directions to install the extension.

Hope that you find this useful.
Have a wonderful day.

-Joann

Robotman
June 10th, 2007, 04:33 PM
Thanks for the info Joann. ;) I think KTTSD (along with KTTSMGR and KSAYIT) are exactly what I'm looking for... but I can't figure out for the life of me how to install KTTSD or Festival or its voices. What am I missing?

joann
June 10th, 2007, 10:39 PM
Robotman,

Let's see. The easiest way to check is to see what software you have installed. So, if you open the Synaptic package manager and do a quick search for "speech", a whole bunch of neat stuff should appear in the package list, with a description of the selected package in the lower portion of the window. If you scroll through the list, at around the letter K you should see kmouth, ksayit, and kttsd. Make sure these have all been checked. Also, at around the letter F there should be festival too, so make sure it is installed as well.

The search results should list a bunch of goodies, including other speech synthesis software which will produce other voices. You may find these useful ... "eflite", "espeak", "flite". (Commercial software is also available to provide even more voices).

After applying the changes and installing the software, you should be able to open KTTSD by going to Applications->Accessibility->kttsmgr.

Hope that this helps :)
- Joann

Robotman
June 11th, 2007, 06:18 AM
Thanks Joann, that worked. :D

TheDro
May 9th, 2008, 07:16 PM
Here's a quick question about this. Will this work even though I am using GNOME and not KDE? In a more general sense, can I install any KDE program in the add/remove program list, or do I actually have to install the whole KDE desktop (probably not the right term) to be able to use these programs?

Btw, I realize that many of you probably think I should just install it, try it out and uninstall it if it doesn't work but I have dial-up and downloading is slow:(.

unutbu
May 9th, 2008, 07:26 PM
If you type

sudo apt-get --simulate install kttsd
you'll get to see which packages will be installed.

If you type
sudo apt-get install kttsd
you'll be told how many MB of space the packages will consume, and you'll be asked if you want to do this. (Note: if kttsd is the only package that needs to be downloaded, you won't be asked. The prompt appears only when there are dependencies that need to be downloaded as well.)
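
On a slow connection it can help to see just the names of the packages a simulated run would pull in. The following is a minimal sketch, assuming the usual `Inst <package> ...` lines that `apt-get --simulate` prints; the function name `list_new_packages` is made up for illustration.

```shell
# Hypothetical helper: read `apt-get --simulate` output on stdin and print the
# name of every package that would be installed or upgraded (the "Inst" lines).
list_new_packages() {
    awk '$1 == "Inst" { print $2 }'
}

# Usage (a simulated run downloads and installs nothing):
#   apt-get --simulate install kttsd | list_new_packages
```

A short list of names means only a few KDE libraries are missing; a long list suggests that much of the KDE base would be downloaded.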

Hydrosis
February 15th, 2010, 02:56 AM
The Ubuntu Text Reader can read any text that you paste into it and you can customize it with different voices.

You can download the latest version at:

http://xzcallaway.synthasite.com/

This is exactly what I've been looking for. I always wonder why so many Linux apps just don't work out-of-the-box and have simple GUIs, like KTTS/Kmouth and Festival.

This should be integrated into the next Ubuntu as a default app. It's small and very, very useful.

phillw
February 17th, 2010, 10:13 AM
This is exactly what I've been looking for. I always wonder why so many Linux apps just don't work out-of-the-box and have simple GUIs, like KTTS/Kmouth and Festival.

This should be integrated into the next Ubuntu as a default app. It's small and very, very useful.

Also, there is a GUI for espeak in the repo for 10.04. It is simply called espeak-gui. It reports a couple of errors on launch (the author is looking into them) but works fine with either pasted text or an opened text document.

Regards,

Phill.

go_beep_yourself
February 18th, 2010, 12:48 AM
I am visually impaired and often find it nice to have software read for me as an addition to my screen magnification software. So I do understand the desire to have software that will read what I tell it to read.

...

KTTS- KDE text to speech system
This works well with KDE, and allows items on the clipboard to be spoken.
<http://accessibility.kde.org/developer/kttsd/>
I think this should be installed by default, maybe. But if it is not, you can install it using the Synaptic package manager. The website has more information.

Does this have an option to automatically read clipboard contents like 2nd Speech Center and TextAloud in Windows? Have you tested this with Gnome?

Read Out Loud- This is for the Adobe PDF reader; it allows PDFs to be spoken. Very nice when you just want to listen to a PDF document.
This link has information on how to set up Read Out Loud
<http://help.adobe.com/en_US/Acrobat/8.0/Standard/help.html?content=WS58a04a822e3e50102bd615109794195ff-7d15.html>
And this link has the software ...
<http://www.adobe.com/products/reader/>

I've gotten this to work in Windows. I thought the Linux version of Adobe Reader just left this feature unimplemented. I just searched the link you provided, and nothing comes up when searching that site for Linux. I am fairly sure I tried to use this feature in Linux a while back, and it was "greyed out" so that I couldn't click on it.

I've gotten very used to having TTS software such as 2nd Speech Center and TextAloud in Windows, and it's been a real pain in the butt not to have something as functional in Linux. I started developing an open source TTS program written in Java that will use existing TTS engines such as festival and espeak, and I've made some progress on it. Although some features are unimplemented, I will put it up on Google Code if anyone is interested in it. I hope to find some coders interested in developing this software too. I've got great plans for it. I just need to implement the rest of the features, and I've already gotten past some hurdles.

LequidMetal
February 28th, 2010, 05:00 PM
I tried getting the Read Out Loud feature to work in Ubuntu and was completely unsuccessful. Orca works with every other app, including Firefox, but for some reason it won't work with Adobe Acrobat 9 (or any of the other GNOME PDF readers I tried). And there seem to be very few tutorials on the subject.

notlistening
March 1st, 2010, 07:05 AM
There is a little application called pdf2txt which, when run on a PDF, will put its written contents into a text file, which in turn can then be very easily read with any TTS program or Orca.
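
The extract-then-speak idea can be sketched as a single pipeline. This is only an illustration under stated assumptions: it uses `pdftotext` (from poppler-utils) as the extractor rather than pdf2txt, and `espeak --stdin` as the speaker. The function name `speak_pdf` and the `TTS_CMD` variable are made up.

```shell
# Sketch: speak the text of a PDF without creating an intermediate file.
# Assumes `pdftotext` (poppler-utils) and `espeak` are installed; both are
# stand-ins chosen for illustration. Set TTS_CMD (a made-up variable) to use
# another engine that reads text from standard input.
speak_pdf() {
    pdf="$1"
    if [ ! -f "$pdf" ]; then
        echo "speak_pdf: no such file: $pdf" >&2
        return 1
    fi
    # "-" tells pdftotext to write the extracted text to standard output.
    pdftotext "$pdf" - | ${TTS_CMD:-espeak --stdin}
}

# Usage:
#   speak_pdf article.pdf
```

Scanned PDFs that contain only images have no text layer to extract, so this approach reads nothing from them.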

go_beep_yourself
March 18th, 2010, 05:23 AM
For anybody interested in seeing an open source application similar to TextAloud and 2nd Speech Center (Windows apps) further developed, I could use some help with this project. So far, it has just been me working on it. I am trying to understand how espeak and mbrola work together, though if it were easier or better, I would develop this program to use festival or flite. I do not see many command line options for those apps, and since my program runs espeak with arguments gathered from the GUI, I don't see the others as an option. My application is working. So far it is able to take any text that is highlighted and copied, and if what is copied is text and is different from the previous copy, it is read by espeak. For now that means the horrible default espeak voice, so I am trying to understand better how espeak and mbrola work together to get some better sounding voices for this application to use. If anyone's interested in seeing this project grow, please contribute. Whether you know how to program in Java or not, you can still contribute by trying the application, by helping me figure out how different features are used through the command line so they can be put into the application, or by packaging it as debs.

Edit: This app allows anything that can be copied as text to the clipboard to automatically be read to you whether the text is in Firefox, pdfs, email, etc.
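
The behaviour described here (speak whatever was just copied, but only when it differs from the last copy) can be approximated in a few lines of shell. This is a rough sketch and not the Java application from the post. It assumes `xclip` and `espeak` are installed and an X session is running; the function names and the one-second polling interval are illustrative choices.

```shell
# Sketch of a clipboard reader: poll the clipboard and speak it when it changes.
# Assumes `xclip` and `espeak` are available; names below are hypothetical.

# Succeeds only when the new text is non-empty and differs from the previous text.
is_new_text() {
    [ -n "$1" ] && [ "$1" != "$2" ]
}

watch_clipboard() {
    last=""
    while true; do
        # Read the current clipboard contents; errors (e.g. empty clipboard) are ignored.
        current=$(xclip -o -selection clipboard 2>/dev/null)
        if is_new_text "$current" "$last"; then
            printf '%s\n' "$current" | espeak --stdin
            last="$current"
        fi
        sleep 1
    done
}

# Usage: run `watch_clipboard` in a terminal and stop it with Ctrl+C.
```

Polling is the simplest approach; it speaks each copy to completion before checking the clipboard again, so a new copy made mid-sentence is picked up only afterwards.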

notlistening
March 18th, 2010, 08:29 PM
I am glad to see someone making the effort to develop tools like this. I wrote a similar tool for Windows back in the day when I knew what Windows was.

I think the direction you want to go is to start looking at Speech Dispatcher. This creates a single interface to all the most commonly used speech engines, so you only have to learn about Speech Dispatcher, not all the different TTS engines. If better text to speech is what you want, then I can recommend my project open-sapi, which allows Linux users to use Windows-based TTS with much nicer voices. It is under development, but I have very kindly added a command line interface for it which is quite straightforward.

The wiki information is quite out of date, and there is a hidden development branch that has taken on new features and got quite a lot of bug fixes. So if you're online on IRC, MSN, or Skype and you're interested, then let me know.

2hot6ft2
March 18th, 2010, 09:16 PM
I am trying to understand how espeak and mbrola work together, though if easier or better, I would develop this program to use festival or flite.
You might take a look at Gespeaker. It's a front end to espeak and uses Mbrola add-ons, which are a bit more tolerable, and it has some controls for how they are managed along with a decent GUI. Sorry, but I have no programming knowledge to help with anything.
Here's the home page for it.
http://code.google.com/p/gespeaker/

hellocatfood
March 23rd, 2010, 06:22 PM
You might take a look at Gespeaker. It's a front end to espeak and uses Mbrola add-ons, which are a bit more tolerable, and it has some controls for how they are managed along with a decent GUI. Sorry, but I have no programming knowledge to help with anything.
Here's the home page for it.
http://code.google.com/p/gespeaker/

Thanks for that! It's the best/easiest one that I've used so far

go_beep_yourself
March 23rd, 2010, 11:36 PM
You might take a look at Gespeaker. It's a front end to espeak and uses Mbrola add-ons, which are a bit more tolerable, and it has some controls for how they are managed along with a decent GUI. Sorry, but I have no programming knowledge to help with anything.
Here's the home page for it.
http://code.google.com/p/gespeaker/

I have gespeaker, and although it is a nice app, it does not do what I want. I don't need any coding help. What I need help with is understanding how mbrola and espeak work together, so I can further develop the application I have.

I was about to continue developing my Text To Speech software some more, and have it use mbrola voices.

I looked at the documentation here.

file:///usr/share/doc/espeak/mbrola.html (note: you can open this in Firefox by pasting the above line in FF's address bar)

And it says

"The Mbrola voices are cost-free but are not open source. They are available from the Mbrola website at:
http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html"

So I go to that website, and there are many downloads listed. I'd like to know what the differences are and what each download is for.

From the website:

# LINUX i386 / ppc / alpha / ultra1 <-- these I am not certain about
# LINUX / Pocket PC <-- This must be for a PDA running Linux
# LINUX / Ircha / Mbrola / ES1 DBA / Z80 <-- I don't know what the different downloads here are for, but I'd like to know
what each are. They may all be beneficial to the program I am creating.
# AMD64/Linux <-- this one is obviously for Linux users running a 64-bit kernel
# ARM/Linux (tested on Nokia N800 and N810) <-- this one is obvious it runs on cell phones

Could someone please explain what the differences between these downloads are and what the ones I am not sure about do? It's not the programming I am stuck on. It is things like this that I must understand in order to further the program, and I want my program to be usable by others, not just myself.

VastOne
March 24th, 2010, 12:08 AM
You might take a look at Gespeaker. It's a front end to espeak and uses Mbrola add-ons, which are a bit more tolerable, and it has some controls for how they are managed along with a decent GUI. Sorry, but I have no programming knowledge to help with anything.
Here's the home page for it.
http://code.google.com/p/gespeaker/

I followed the gespeaker instructions to a T and successfully installed both it and Mbrola.

But where is it? I have no starting app icon anywhere and running gespeaker or Gespeaker from terminal gives me nothing???

Any help appreciated!

go_beep_yourself
March 24th, 2010, 02:31 PM
I followed the gespeaker instructions to a T and successfully installed both it and Mbrola.

But where is it? I have no starting app icon anywhere and running gespeaker or Gespeaker from terminal gives me nothing???

Any help appreciated!

dpkg -L gespeaker
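
`dpkg -L` lists every file a package installed, which can be a long list. To pick out what can actually be launched, the output can be narrowed to the usual program directories. A minimal sketch follows; the helper name `list_executables` is made up, and the directory list is an assumption about where packages normally put programs.

```shell
# Hypothetical helper: keep only the paths from `dpkg -L` output that sit in a
# directory where launchable programs normally live.
list_executables() {
    grep -E '^/usr/(local/)?(s?bin|games)/|^/s?bin/'
}

# Usage:
#   dpkg -L gespeaker | list_executables
```

If this prints nothing, the package either installed no program at all or `dpkg` does not know the package, which points to an installation that did not go through the package manager.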

Talorgan
April 16th, 2010, 07:18 PM
Is there such a thing as text-to-dialogue software?

The idea would be that the computer would give the different characters of a play different voices so that you could hear the conversation.

notlistening
April 18th, 2010, 03:10 PM
Umm, not sure about other TTS engines, but SAPI / opensapi can be used with speech XML, where you can define how the speech engine processes the speech throughout a document. So in theory you can define different voices / pitch / speech/volume for different parts. Other engines might support similar facilities.

NL
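
A cruder way to get one voice per character, without any speech XML, is to call the engine once per line with a different voice. The sketch below swaps in that simpler technique. It assumes `espeak` is installed and that the play is a text file with lines of the form `NAME: spoken text`; the voice list and the function names are illustrative.

```shell
# Sketch: speak each line of a play in a voice tied to the character.
# Assumes `espeak` is installed; the voice list and names are hypothetical.
# Lines without a "NAME: " prefix (e.g. stage directions) are skipped.

# Print "voice<TAB>text" for every spoken line. Each new character gets the
# next voice in the list, wrapping around when the list runs out.
assign_voices() {
    awk -F': ' '
        BEGIN { n = split("en+m3 en+f2 en+m7 en+f4", voices, " ") }
        NF >= 2 {
            name = $1
            if (!(name in voice_of)) {
                voice_of[name] = voices[(count % n) + 1]
                count++
            }
            # Everything after the first ": " is the spoken text.
            text = substr($0, length(name) + 3)
            printf "%s\t%s\n", voice_of[name], text
        }'
}

speak_play() {
    assign_voices < "$1" | while IFS="$(printf '\t')" read -r voice text; do
        # stdin is redirected so espeak cannot consume the remaining lines.
        espeak -v "$voice" "$text" < /dev/null
    done
}

# Usage:
#   speak_play hamlet.txt
```

Voices are handed out in order of first appearance, so a given character keeps the same voice for the whole play.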

hellocatfood
April 19th, 2010, 12:12 PM
Another one to look out for is Simon (http://simon-listens.org/index.php?id=122&L=1). It's only for newer versions of KDE though...

Talorgan
April 22nd, 2010, 12:38 PM
Thanks

I'll check these out

wesselvanpersie
June 8th, 2010, 11:42 AM
Does there exist some free text to speech software which uses Unit selection synthesis?
So using a large database of recorded speech to make new speech.

I think both eSpeak and Gespeaker use diphone synthesis, which uses only a small dataset of diphones. While these programs are small, they tend to sound robot-like.
Like this:
http://www.youtube.com/watch?v=M5Z2YGV2CJ0

The current state of the art of text to speech is so much better than this :-/!

vangop
June 24th, 2010, 03:04 PM
Hey guys!
I'm too used to having nice TTS to accept festival/espeak on Linux. They sound too much like "Microsoft Sam default voooooice" :) and non-English languages are unacceptable.
I have great NextUp/Acapela voices which worked great out of the box in Windows with the coolreader book reading app and a few others as well.
In Ubuntu 10.04 I can't install them under Wine.
Coolreader works fine.
Acapela license manager simply doesn't see the license (reported in wine appdb), and nextup voice is not visible by coolreader. It must have been incorrectly registered in wine registry or something.
Was anyone able to install/use commercial win sapi voices in ubuntu?

notlistening
June 25th, 2010, 08:20 AM
Was anyone able to install/use commercial win sapi voices in ubuntu?

Yeah I am, using various SAPI Speech engines. I have not tested all of them though. You might want to pm me and have a read of:

http://code.google.com/p/open-sapi/

Please bear in mind that it's still under heavy development.

NL

wesselvanpersie
July 6th, 2010, 07:47 AM
Could you report your findings?

I'm looking for a good English text to speech application, better than eSpeak.

I guess I'm willing to pay money for it.

But preferably I would like to use some open source software, so I can make modifications myself.

As I said earlier, not all text to speech applications work in the same way.
These formant-based synthesis / diphone synthesis applications like eSpeak make it really fatiguing to listen for more than 5 minutes.

Unit selection synthesis applications take up a thousand times more disk space, but I find them much nicer to listen to.

vangop
July 7th, 2010, 05:26 AM
SAPI voices work fine on Wine. I'm using NextUp voices over Wine. They are commercial though. I didn't manage to install Acapela voices; the license manager won't see the trial license, so I couldn't try it, but people report it is better.
I spent a day trying to debug the stupid licman with strace to see why it doesn't see the files, but still no luck.
I didn't find a way to read web pages with text aloud so I removed the stupid thing :). I use coolreader2 to read books I download.

notlistening
July 8th, 2010, 07:16 PM
Just out of interest, can people please list the voices they have working with Wine, and those that don't?

I have the basic MS voices:
Sam
Mike
Mary
The newer MS Vista/7 voice Anna

VoiceWare Kate16k
VoiceWare Paul16k
VoiceWare Julie

Can people add to the list as they try good and bad ones please.

NL

vangop
July 9th, 2010, 12:45 AM
I've tried Loquendo voices, all work fine, NextUP voices (Katerina), no problem as well.
I failed to install (not run) Acapela due to the stupid license manager which doesn't see the license.
IMHO as long as you can install a voice engine, you will have it working OK.

wesselvanpersie
August 4th, 2010, 08:12 AM
Do these voices sound as crappy as this?
The intonation is just so terrible.
The pitch goes up and down where it's not supposed to go up or down.

http://www.innoetics.com/media/tts_iandemo_en.mp3

here is an online demo which allows you to input your own words:
http://www.acapela-group.com/text-to-speech-interactive-demo.html

kline
December 11th, 2010, 07:59 PM
Is there such a thing as text-to-dialogue software?

The idea would be that the computer would give the different characters of a play different voices so that you could hear the conversation.

The "festival" suite that was started by Prof Alan Black and many others over in Edinburgh is worth looking at. If you google "festival, tts" you'll find truckloads of material. It is the most complete TTS around.

vangop
December 20th, 2010, 06:12 AM
I was too optimistic saying that SAPI installs just fine. If you install it with default settings, the voices might be messed up or not registered properly.
I found some workarounds and tried to outline them here: http://ubuntu-answers.blogspot.com/2010/12/text-to-speach-sapi-on-linux-ubuntu.html

Talorgan
December 20th, 2010, 06:24 PM
The "festival" suite that was started by Prof Alan Black and many others over in Edinburgh is worth looking at.

Thanks!

I'll take a look.

Presumably seeking "text-to-play" software is just a bit optimistic at the moment?!

("Play" as in "theatre production")

notlistening
December 21st, 2010, 02:00 PM
Yes, I have been using and developing with commercial Windows TTS engines under Linux. Things have ground to a halt due to a new, very demanding job and no other help on the project. The key to getting it all to work is setting the default OS in Wine to Windows 2000 and then running the SAPI installer. You can find a quick way to do all of this here: http://code.google.com/p/open-sapi/wiki/DeveloperEnvironment

This will get you up and running using the Neospeech voices at least. I can not speak for others but generally you can get them working with a bit of work.

Drop me a message if you need further help.

LewRockwellFAN
January 25th, 2011, 03:26 AM
NextUp and Loquendo both work fine in Wine. And both sound great. Under NextUp you can install the Ivona voice.* You'll have to google around a bit for the exact procedure on the NextUp as it is a bit tricky. If I recall you install it in Wine emulating one variety of windows and run it under another variety or something like that. I've done it and it worked fine and Ivona is a great voice. But that was a natty installation I've since junked for other reasons and it was a free trial on the NextUp anyway. A friend of mine has the Loquendo running under wine and several voices and he can make it do tricks. Sounds good.

*I wasn't quite right about that. See later post in this thread.

LewRockwellFAN
January 25th, 2011, 03:37 AM
One more thought, xzcallaway (http://ubuntuforums.org/member.php?u=648712), I truly mean no offense and your site looks interesting, but why should anyone install a deb from someone they know nothing about? It's not like getting it straight from some well known project or compiling source you have written yourself. It seems to be a major security breach potentially. If you can explain why I am wrong about this I would be glad to hear it. Seriously, I'm sure your debs are fine, but isn't this a pretty bad practice? I may be totally offbase on this and if so, I'd appreciate being educated. But it seems contrary to what I understand as the common perception of secure procedure. Maybe it's acceptable to just scan them with clam or something. I'd appreciate informed comment on this point.

LewRockwellFAN
April 4th, 2011, 04:18 PM
Since my previous post in this thread, I've experimented with TTS again and I need to correct one misstatement. While I have been able to get the Ivona voices working on a free trial, they do NOT continue to work for me after the next reboot. Despite the fact that the trial is supposed to be for 30 days, the licenses expire as soon as I reboot. I didn't realise this the first time I tried them because the Natty installation screwed up hopelessly on reboot and I had to reinstall. Installing Ivona takes a loooooooooooooooooooooooooong time, so I'm not going to experiment with it any more unless I have some reason to think that problem is fixed. Way too much work to go to just on the chance you might be able to get it right with another try. I tried under both versions of Wine in the repository. A VirtualBox or similar VM with Windows in it might work, but it takes at least XP and the only Windows I have been able to install in VirtualBox is 2000. I tried a W7 trial disk, but VirtualBox fails to install it, giving an erroneous "low disk space on host" error message, and I haven't been able to figure out Faubox. Darn things sound great though. If anyone gets a trial to work I'd like to hear about how they did it. But I'm sure not going to buy it without getting the trial to work, because the licensed version might have the same problem.

zoubidoo
December 30th, 2011, 12:47 PM
Lucid LTS + Virtualbox + WinXP + Ivona trial version works perfectly. Just make sure you give the VM plenty of disk space.

I'd be uncomfortable paying for it as it needs VirtualBox and Windows. But if it could be installed through the Ubuntu Software Centre, I wouldn't hesitate for a moment.

Dreamer Fithp Apprentice
January 20th, 2012, 02:19 AM
I have tried to get Ivona to work also. Never had any luck with Wine at all. In a Virtualbox with XP I can get the British voices to install and the American voices down to Salli ("down to" that is, in terms of the order in which they are listed in the installer, which is NOT alphabetical) but neither Salli nor any of the American or non-english voices past Salli would install. I don't think it is just disk space because I tried installing JUST Salli and it wouldn't work. I don't think it is a case of older voices work and newer ones don't because I think Ewa was their first voice and it is one of the ones that won't install. Go figger.

Michal Fapso
January 30th, 2012, 04:23 AM
Here is a Perl script which takes a text file as input and generates an MP3 file. It uses Google TTS:

http://michalfapso.blogspot.com/2012/01/using-google-text-to-speech.html

frytek
February 6th, 2012, 12:39 PM
Here is a Perl script which takes a text file as input and generates an MP3 file. It uses Google TTS:

http://michalfapso.blogspot.com/2012/01/using-google-text-to-speech.html

I wonder when Google guys realize they need captcha there. :)


BTW: could anybody use this script to create a new one which would do the same using Android SVOX pico2wave? SVOX has a nice voice and runs on Ubuntu, but it accepts only very short strings. Basically, we would need to split the text and send it phrase by phrase in the same way. I am only a user and I can't do it myself. :(
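
The phrase-by-phrase idea can be sketched in shell rather than Perl. The following is an illustration only. It assumes GNU `sed` and `mktemp`, and that the SVOX command is installed as `pico2wave` with `aplay` available for playback; the function names and the `en-US` language choice are assumptions.

```shell
# Sketch: split text into sentences so a short-input engine gets one at a time.
# Assumes GNU sed (for "\n" in the replacement text); names are hypothetical.
split_sentences() {
    # Join all lines, break after sentence-ending punctuation, trim, drop blanks.
    tr '\n' ' ' \
        | sed -E 's/([.!?]) +/\1\n/g' \
        | sed -E 's/^ +//; s/ +$//; /^$/d'
}

# Speak a text file sentence by sentence.
# Assumes `pico2wave` and `aplay` are installed; the .wav suffix is added
# because the output file is a WAV file that aplay then plays.
speak_with_pico() {
    tmp=$(mktemp --suffix=.wav) || return 1
    split_sentences < "$1" | while IFS= read -r sentence; do
        pico2wave -l en-US -w "$tmp" "$sentence" && aplay -q "$tmp"
    done
    rm -f "$tmp"
}

# Usage:
#   speak_with_pico chapter1.txt
```

Splitting on `.`, `!`, and `?` is naive: abbreviations such as "e.g." produce extra breaks, and a very long sentence may still exceed what the engine accepts.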

majster
February 15th, 2012, 07:18 PM
I absolutely LOVE this script !!!! :guitar::guitar::guitar:=D>

THANK YOU MICHAL FAPŠO :)

Dreamer Fithp Apprentice
February 26th, 2012, 05:25 PM
. . . SVOX has a nice voice and runs on Ubuntu . . .

Would somebody who understands that mind clarifying it? I find some SVOX libs in synaptic but it isn't at all clear what I'm to do with them.

go_beep_yourself
May 9th, 2012, 10:19 PM
I think I know exactly what you are looking for. Check the youtube video that's on this blog for the software JSpeak.

http://linuxinnovations.blogspot.com/2012/05/jspeak-ultimate-in-linux-text-to-speach.html