Thread: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

  1. #141
    Join Date
    Mar 2010
    Location
    Lunar Base VII- Sector IX
    Beans
    1,745
    Distro
    Ubuntu 13.04 Raring Ringtail

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    Not sure what to do there, sorry mate.

    However, since moving to 11.04 I have discovered a program called "Gespeaker".
    It has a nice GUI, and you can choose between voices, languages, etc.
    Just type your text in and press play.

    Check out "Gespeaker" in the Software Center.
    Pretty decent!

  2. #142
    Join Date
    Jan 2008
    Beans
    56

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    On this forum there's also a script for converting text to speech with espeak (+ mbrola):

    http://ubuntuforums.org/showthread.p...eech+synthesis

  3. #143
    Join Date
    Nov 2007
    Beans
    2

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    I succeeded in getting festival to work with the HTS voices on Ubuntu 11.04. I simply installed the festival package from lucid [1] and followed the procedure for installing the HTS 2.1 voices. You need one additional dependency from lucid, libestools1.2. Simply download these two packages and install the .debs manually with dpkg -i.

    [1] https://launchpad.net/ubuntu/+source...+build/1335123

  4. #144
    Join Date
    Jan 2011
    Location
    Chicago, IL
    Beans
    13
    Distro
    Ubuntu 10.10 Maverick Meerkat

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    Quote (originally posted by redaxe):
    I am new to Festival and new to the Unix/Linux environment.

    I have set up Festival 2.1 on Ubuntu 10.10 successfully with the instructions given in the INSTALL file.
    I am also looking for some help using those versions of Ubuntu and Festival...

    Thanks

  5. #145
    Join Date
    Oct 2011
    Beans
    1

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    Hi,

    I got one of the HTS 2.2 voices (arctic_slt_hts) compiled and trained, and it works with festival 2.1. Compilation took about 8 hours and training about 30 hours. From my perspective it was worth it. I used Arch Linux rather than Ubuntu for this (no flames, please), but the resulting voice should be usable on Ubuntu as well: just drop the tarball's contents into /usr/share/festival/voices/us/ and you should be good to go.

    You can get the tarball with the voice here:
    http://dl.dropbox.com/u/1845335/rele...tic_hts.tar.gz
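Unpacking the tarball as described above could be sketched roughly like this. The function name `install_voice` and its arguments are made up for illustration, and the sketch assumes the archive is gzip-compressed and unpacks to a voice directory at its top level, which the post does not spell out.

```shell
# install_voice TARBALL [DEST]: unpack a gzipped voice tarball into DEST.
# DEST defaults to the directory named in the post; writing there needs root.
# (Hypothetical helper, not part of festival.)
install_voice() {
    tarball=$1
    dest=${2:-/usr/share/festival/voices/us}
    # Refuse to continue if the archive is missing
    [ -f "$tarball" ] || { echo "no such file: $tarball" >&2; return 1; }
    # Create the target directory if needed, then extract into it
    mkdir -p "$dest" && tar -xzf "$tarball" -C "$dest"
}
```

It may be worth unpacking into a scratch directory first (for example `install_voice voice.tar.gz ~/voices-test`, where `voice.tar.gz` stands for whatever you downloaded) to check the layout before extracting into the system directory with root privileges.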

    If you want to hear the difference first, I had the old version (the latest prebuilt one from Nitech HTS) and the new version read the first two paragraphs of this article:
    http://en.wikinews.org/wiki/Eyewitne...fatal_protests

    You can get the two mp3s here:
    Old: http://dl.dropbox.com/u/1845335/release/news_old.mp3
    New: http://dl.dropbox.com/u/1845335/release/news_new.mp3

    Personally I think the new version sounds much smoother (it was compiled and trained with the default options).

    What this boils down to is that you most likely no longer need to install festival from source: you can use the normal packages and still have access to the newest Nitech voices (provided you are willing to compile and train them, or use the one I posted).

  6. #146
    Join Date
    Oct 2011
    Beans
    1

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    Hi Calrama:

    Nice job.

    The voice works well on Ubuntu Oneiric 11.10

    To make this the default voice I did:
    $ sudo gedit /usr/share/festival/voices.scm

    (defvar default-voice-priority-list
    '(;nitech_us_slt_arctic_hts ; [Error: HTS_Model_load_pdf: Failed to load header of pdfs.]
    ;kal_diphone
    ;cmu_us_bdl_arctic_hts
    ;cmu_us_jmk_arctic_hts
    cmu_us_slt_arctic_hts ; Custom compile
    ;cmu_us_awb_arctic_hts
    ; cstr_rpx_nina_multisyn ; restricted license (lexicon)
    ; ...
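To find out which voice names can go into that priority list, something like the following could be used. It is only a sketch: `list_voices` is a made-up helper, and it assumes the layout seen elsewhere in this thread, where each voice lives in `<root>/<lang>/<voice>/festvox/<voice>.scm`.

```shell
# list_voices [ROOT]: print the name of every voice under ROOT, i.e. every
# directory ROOT/<lang>/<voice> that contains festvox/<voice>.scm.
# (Hypothetical helper; ROOT defaults to the path used in this thread.)
list_voices() {
    root=${1:-/usr/share/festival/voices}
    for dir in "$root"/*/*/; do
        name=$(basename "$dir")
        # Only report directories that carry a matching voice definition file
        [ -f "${dir}festvox/$name.scm" ] && echo "$name"
    done
    return 0
}
```

Each name it prints should also be selectable from the festival prompt by prefixing it with `voice_`, for example `(voice_cmu_us_slt_arctic_hts)`, which is a quick way to try a voice before changing the default.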

    Thanks again

  7. #147
    Join Date
    Dec 2008
    Location
    Isolated Digital Reign
    Beans
    151
    Distro
    Kubuntu Development Release

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    @Calrama: THX so much for investing the time & resources as well as sharing the voice with all of us!

    For everyone who gets the following message:
    Code:
    Error: HTS_Model_load_pdf: Failed to load header of pdfs.
    The Debian maintainers already have a "fix" for this: they include the old HTS engine and ship it as a module named "hts21compat".

    You can find all the information in this bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=589614

    So, if no appropriate bug is listed here, you may want to file one asking for that patch to be included. That way the existing pre-trained HTS 2.1 voices would keep working, while new voices work with the new hts_engine.

  8. #148
    Join Date
    Oct 2008
    Beans
    171
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    That is good news, thanks for the heads-up.

    Has anyone managed to build festival from the Debian patch? If so, a short how-to would be much appreciated!

    Thanks,
    Mike

    EDIT: I managed to build it myself. It goes something like this:

    Code:
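    # Work in a scratch directory
    $ mkdir whatever
    $ cd whatever
    # Fetch the Debian packaging repository for festival
    $ git clone git://anonscm.debian.org/tts/festival.git
    # Create the .orig tarball that debuild expects next to the source tree
    $ tar cvf festival_2.1~release.orig.tar festival
    $ gzip festival_2.1~release.orig.tar
    # Build the packages
    $ cd festival
    $ debuild
    # Install the result (the file name depends on version and architecture)
    $ cd ..
    $ sudo dpkg --install festival_2.1~release-2.2_i386.deb
    That's it. You get some errors about signing at the end of the build; these can be ignored.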

    You have to install some dependencies of course, but it was easy.
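The post does not list those dependencies. One plausible way to pull them in is sketched below, assuming deb-src entries are enabled and that the archive's festival source package has build dependencies close to those of the git checkout. The `run` and `install_build_deps` helpers are made up for this example; by default they only print the commands.

```shell
# run CMD...: echo the command when DRY_RUN=1 (the default), else execute it.
# (Hypothetical helper so the sketch can be inspected before anything is installed.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_build_deps() {
    # devscripts provides debuild; build-essential provides the compiler toolchain
    run sudo apt-get install devscripts build-essential
    # Build dependencies declared by the archive's festival source package
    # (assumption: these are close enough to what the git checkout needs)
    run sudo apt-get build-dep festival
}
```

Calling `install_build_deps` prints the two commands; `DRY_RUN=0 install_build_deps` would actually run them. If debuild still complains about a missing package, installing it by hand is probably the quickest fix.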

    Carefully read the Debian bug report linked above about changing the files
    Code:
    /usr/share/festival/voices/us/nitech_us_XXX_arctic_hts/festvox/nitech_us_XXX_arctic_hts.scm
    to make it work with the backward compatibility module that was added.

    Cheers,
    Mike
    Last edited by SwedishWings; December 3rd, 2011 at 02:04 AM.

  9. #149
    Join Date
    Jun 2007
    Beans
    5
    Distro
    Kubuntu 7.04 Feisty Fawn

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    I wasn't able to find some of the festvox files at the location in the Howto but was able to find them here:

    http://www.speech.cs.cmu.edu/festiva...estival/1.4.0/

    Hope this helps someone out. Otherwise, despite its age, this is a very helpful article.

  10. #150

    Re: HOWTO: Make festival TTS use better voices (MBROLA / CMU / HTS)

    Wow. I spent over 40 hours in total compiling this (HTS-demo_CMU-ARCTIC-SLT) over and over, working out the kinks, only to realize that these are not the same as the Nitech voices**.

    At any rate, one of the errors I kept getting was the same one mrplow hit (quoted below). As it turns out, SoX dropped the deprecated '-w' switch in version 14.1.0, so SoX prints its help text and 'Training.pl' fails. (See http://sox.git.sourceforge.net/git/g...8b7df334606820.)

    In essence, this means there is one more dependency that the 'INSTALL' readme does not mention: SoX 14.0.1 or earlier. (Go figure.) I suppose this could also be fixed by editing 'Config.pm' in the 'scripts' directory, as mrplow suggests, to use the newer '-2' switch instead of '-w', but this is how I fixed it:
    ____________

    First, download the source for SoX 14.0.1 (the latest version that still supports the '-w' switch):

    Code:
    cd ~/Downloads
    wget -c http://sourceforge.net/projects/sox/files/sox/14.0.1/sox-14.0.1.tar.gz
    Then, extract its contents, configure it, make it, and install it:

    Code:
    tar xvf sox-14.0.1.tar.gz
    cd sox-14.0.1/
    ./configure
    make
    sudo make install
    This should have installed the binaries to '/usr/local/bin', BUT SoX will fail to link to its library file 'libsox.so.0' in '/usr/local/lib'! To fix this, run:

    Code:
    LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH
    (You'll have to do this every time you build the voice.)
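A slightly more careful variant of the two lines above is sketched here. `prepend_lib_path` is a made-up helper name; the idea is to avoid adding the directory twice and to avoid a trailing colon when the variable was empty, since an empty entry is generally treated as the current directory by the dynamic linker.

```shell
# prepend_lib_path DIR: put DIR at the front of LD_LIBRARY_PATH unless it is
# already listed, without leaving an empty entry when the variable is unset.
# (Hypothetical helper for illustration.)
prepend_lib_path() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}
```

With that defined, `prepend_lib_path /usr/local/lib` in the shell you build from has the same effect as the commands above. On many systems running `sudo ldconfig` once after `sudo make install` may make this step unnecessary, but that was not tested in this thread.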

    Now you should have the right version of SoX installed. If it still isn't using the right version of SoX, uninstall any other versions of SoX on your system (sudo apt-get remove sox) and make sure '/usr/local/bin' is set in the PATH variable. Heck, I recommend you just uninstall any other versions of SoX before you begin compiling, just in case. (It's not worth it to build for 20+ hours to run into the same error again.) Besides, you can always re-install the official Ubuntu SoX package when you're done.
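For anyone who would rather try the other route (changing the switch instead of downgrading SoX), the choice of switch could be sketched like this. Both helper names are made up, the 14.1.0 cut-off is simply the one claimed earlier in this post, and `sort -V` is assumed to be available (it is part of GNU coreutils).

```shell
# version_lt A B: succeed when version A sorts strictly before version B.
# Relies on "sort -V" (GNU coreutils), which is an assumption about the system.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# sox_width_flag VERSION: print the 16-bit sample-size switch for that SoX
# version, following this post's claim that '-w' was dropped in 14.1.0
# in favour of '-2'. (Hypothetical helper.)
sox_width_flag() {
    if version_lt "$1" "14.1.0"; then
        echo "-w"
    else
        echo "-2"
    fi
}
```

For example, `sox_width_flag 14.3.1` prints `-2` and `sox_width_flag 14.0.1` prints `-w`. The help output quoted below for SoX 14.3.1 also lists `-b|--bits BITS`, so `-b 16` may be another option on that version.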

    You should now be able to compile HTS-demo_CMU-ARCTIC-SLT, though, as I said before, the CMU Arctic voices are not the same as the Festvox Nitech voices (which I find to be superior and the ones that I believe mrplow was really looking for). As far as I know, there are no Nitech voices built for Festival 2.X**. If you want to use the Nitech voices, you should probably just grit your teeth and use an older version of Festival (1.96 or earlier)**.

    Happy compiling!

    **EDIT: Gaah! Shame on me for trying to write a forum post at 2 in the morning. If you want to use the Nitech voices with Festival 2.0.96 or later, see CHaoSlayeR's post above. (Literally, just scroll up.) You'll need to build a compatibility patch into Festival. For information on how to do that, see SwedishWings's post right below it. I need some coffee.
    ____________

    Quote (originally posted by digitaltoast):
    I had this same problem, until I found a post suggesting that it's because festival 2.0.95 requires HTS 2.1.1 voices, which can be found here:
    http://hts.sp.nitech.ac.jp/archives/2.1.1/

    But it's not straightforward! The whole Festival system seems to be designed to be complicated and keep non-geeks out!

    Want to try the 2.1.1 voices? You need to do this:

    Code:
    * Installation of HTS-demo_CMU-ARCTIC-SLT
    ==========================================
    
    1. HTS-demo_CMU-ARCTIC-SLT requires Festival, SPTK-3.3, HTS-2.1.1, hts_engine API-1.03, and OpenFst-1.1.
       Please install them before running this demo.
       You can download them from the following websites:
    
       Festival: http://www.cstr.ed.ac.uk/projects/festival/
       SPTK: http://sp-tk.sourceforge.net/
       HTS: http://hts.sp.nitech.ac.jp/
       hts_engine API: http://hts-engine.sourceforge.net/
       OpenFst: http://www.openfst.org/
    
       In HTS-demo_CMU-ARCTIC-SLT, a simple F0 extraction script written in Tcl/Tk is included.
       This script calls get_f0 function implemented in the open-source speech toolkit Snack.
       Therefore, HTS-demo_CMU-ARCTIC-SLT also requires Tcl/Tk with Snack.
       ActiveState (http://www.activestate.com/) provides a Tcl/Tk distribution named ActiveTcl
       for many platforms.  You can download it from
    
       ActiveTcl: http://downloads.activestate.com/ActiveTcl/
    
       The above distribution includes Snack and it is easy to install and use.
       We recommend you to use this to run this demonstration
       (Of course you can use your own tcl/tk with Snack).
       Note that ActiveTcl 8.5 doesn't include Snack, please use ActiveTcl 8.4.
    
    
    2. Setup HTS-demo_CMU-ARCTIC-SLT by running configure script:
    
       % cd HTS-demo_CMU-ARCTIC-SLT
       % ./configure --with-tcl-search-path=/usr/local/ActiveTcl/bin \
                     --with-fest-search-path=/usr/local/festival/examples \
                     --with-sptk-search-path=/usr/local/SPTK-3.3/bin \
                     --with-hts-search-path=/usr/local/HTS-2.1.1_for_HTK-3.4.1/bin \
                     --with-hts-engine-search-path=/usr/local/hts_engine_API-1.03/bin \
                     --with-openfst-search-path=/usr/local/openfst-1.1/bin
    
       Please adjust the above directories for your environment.
       Note that you should specify festival/examples rather than festival/bin.
    
       You can change various parameters such as speech analysis conditions and model training conditions
       through ./configure arguments.  For example
    
       % ./configure MGCORDER=24 GAMMA=0 FREQWARP=0.0              (24-th order cepstrum)
       % ./configure MGCORDER=24 GAMMA=0 FREQWARP=0.42             (24-th order Mel-cepstrum)
    
       % ./configure MGCORDER=12 GAMMA=1 FREQWARP=0.0  LNGAIN=0    (12-th order LSP,     linear gain)
       % ./configure MGCORDER=12 GAMMA=1 FREQWARP=0.0  LNGAIN=1    (12-th order LSP,     log gain)
       % ./configure MGCORDER=12 GAMMA=1 FREQWARP=0.42 LNGAIN=1    (12-th order Mel-LSP, log gain)
       % ./configure MGCORDER=12 GAMMA=3 FREQWARP=0.42 LNGAIN=1    (12-th order MGC-LSP, log gain)
    
       % ./configure NSTATE=7 NITER=10 WFLOOR=5   (# of HMM states=7, # of EM iterations=10, mix weight floor=5)
    
       Please refer to the help message for details:
    
       % ./configure --help
    
    
    3. Start running demonstration as follows:
    
       % cd HTS-demo_CMU-ARCTIC-SLT
       % make
    
       After composing training data, HMMs are estimated and speech waveforms are synthesized.
       It takes about 12 to 18 hours :-)
    12 to 18 HOURS??? And I don't even know what I'm going to end up with. What does "DEMO" mean? Does it just say something and stop? Also, do I want
    http://hts.sp.nitech.ac.jp/archives/...-ADAPT.tar.bz2
    or
    http://hts.sp.nitech.ac.jp/archives/...RAIGHT.tar.bz2
    ?

    It's not the 492 MB of each file I mind, it's the idea of spending 12-18 hours building one only to find I wanted the other one!

    The only manual I can find for Festival is here:
    http://www.cstr.ed.ac.uk/projects/festival/manual/
    Dated 1999, for version 1.4

    I sometimes feel like I've missed the basics somewhere.
    Were it not for threads like this I'd be completely lost!
    Quote (originally posted by mrplow):
    well that was fun, I'm not sure how far I made it but I eventually ran into this error 70 hours into compiling
    Code:
    ====================================================================================
    Start synthesizing waveforms (speaker independent) at Thu Nov 11 18:38:44 PST 2010
    ====================================================================================
    
    Processing directory /home/mrplow/Desktop/HTS/HTS-demo_CMU-ARCTIC-ADAPT/HTS-demo_CMU-ARCTIC-ADAPT/gen/qst001/ver1/SI/0:
     Synthesizing a speech waveform from cmu_us_arctic_slt_alice01.mgc and cmu_us_arctic_slt_alice01.lf0.../usr/bin/sox: invalid option -- w
    /usr/bin/sox FAIL sox: invalid option
    
    /usr/bin/sox: SoX v14.3.1
    
    Usage summary: [gopts] [[fopts] infile]... [fopts] outfile [effect [effopt]]...
    
    SPECIAL FILENAMES (infile, outfile):
    -                        Pipe/redirect input/output (stdin/stdout); may need -t
    -d, --default-device     Use the default audio device (where available)
    -n, --null               Use the `null' file handler; e.g. with synth effect
    -p, --sox-pipe           Alias for `-t sox -'
    
    SPECIAL FILENAMES (infile only):
    "|program [options] ..." Pipe input from external program (where supported)
    http://server/file       Use the given URL as input file (where supported)
    
    GLOBAL OPTIONS (gopts) (can be specified at any point before the first effect):
    --buffer BYTES           Set the size of all processing buffers (default 8192)
    --clobber                Don't prompt to overwrite output file (default)
    --combine concatenate    Concatenate all input files (default for sox, rec)
    --combine sequence       Sequence all input files (default for play)
    -D, --no-dither          Don't dither automatically
    --effects-file FILENAME  File containing effects and options
    -G, --guard              Use temporary files to guard against clipping
    -h, --help               Display version number and usage information
    --help-effect NAME       Show usage of effect NAME, or NAME=all for all
    --help-format NAME       Show info on format NAME, or NAME=all for all
    --i, --info              Behave as soxi(1)
    --input-buffer BYTES     Override the input buffer size (default: as --buffer)
    --no-clobber             Prompt to overwrite output file
    -m, --combine mix        Mix multiple input files (instead of concatenating)
    -M, --combine merge      Merge multiple input files (instead of concatenating)
    --magic                  Use `magic' file-type detection
    --multi-threaded         Enable parallel effects channels processing (where
                             available)
    --norm                   Guard (see --guard) & normalise
    --play-rate-arg ARG      Default `rate' argument for auto-resample with `play'
    --plot gnuplot|octave    Generate script to plot response of filter effect
    -q, --no-show-progress   Run in quiet mode; opposite of -S
    --replay-gain track|album|off  Default: off (sox, rec), track (play)
    -R                       Use default random numbers (same on each run of SoX)
    -S, --show-progress      Display progress while processing audio data
    --single-threaded        Disable parallel effects channels processing
    --temp DIRECTORY         Specify the directory to use for temporary files
    --version                Display version number of SoX and exit
    -V[LEVEL]                Increment or set verbosity level (default 2); levels:
                               1: failure messages
                               2: warnings
                               3: details of processing
                               4-6: increasing levels of debug messages
    FORMAT OPTIONS (fopts):
    Input file format options need only be supplied for files that are headerless.
    Output files will have the same format as the input file where possible and not
    overriden by any of various means including providing output format options.
    
    -v|--volume FACTOR       Input file volume adjustment factor (real number)
    --ignore-length          Ignore input file length given in header; read to EOF
    -t|--type FILETYPE       File type of audio
    -s/-u/-f/-U/-A/-i/-a/-g  Encoding type=signed-integer/unsigned-integer/floating
                             point/mu-law/a-law/ima-adpcm/ms-adpcm/gsm-full-rate
    -e|--encoding ENCODING   Set encoding (ENCODING in above list)
    -b|--bits BITS           Encoded sample size in bits
    -1/-2/-3/-4/-8           Encoded sample size in bytes
    -N|--reverse-nibbles     Encoded nibble-order
    -X|--reverse-bits        Encoded bit-order
    --endian little|big|swap Encoded byte-order; swap means opposite to default
    -L/-B/-x                 Short options for the above
    -c|--channels CHANNELS   Number of channels of audio data; e.g. 2 = stereo
    -r|--rate RATE           Sample rate of audio
    -C|--compression FACTOR  Compression factor for output format
    --add-comment TEXT       Append output file comment
    --comment TEXT           Specify comment text for the output file
    --comment-file FILENAME  File containing comment text for the output file
    --no-glob                Don't `glob' wildcard match the following filename
    
    AUDIO FILE FORMATS: 8svx aif aifc aiff aiffc al amb amr-nb amr-wb anb au avr awb caf cdda cdr cvs cvsd cvu dat dvms f32 f4 f64 f8 fap flac fssd gsm gsrt hcom htk ima ircam la lpc lpc10 lu mat mat4 mat5 maud nist ogg paf prc pvf raw s1 s16 s2 s24 s3 s32 s4 s8 sb sd2 sds sf sl smp snd sndfile sndr sndt sou sox sph sw txw u1 u16 u2 u24 u3 u32 u4 u8 ub ul uw vms voc vorbis vox w64 wav wavpcm wv wve xa xi
    PLAYLIST FORMATS: m3u pls
    AUDIO DEVICE DRIVERS: alsa
    
    EFFECTS: allpass band bandpass bandreject bass bend biquad chorus channels compand contrast crop+ dcshift deemph delay dither divide+ earwax echo echos equalizer fade filter* fir firfit+ flanger gain highpass input# key* ladspa loudness lowpass mcompand mixer noiseprof noisered norm oops output# overdrive pad pan* phaser pitch polyphase* rabbit* rate remix repeat resample* reverb reverse riaa silence sinc spectrogram speed splice stat stats stretch swap synth tempo treble tremolo trim vad vol
      * Deprecated effect    + Experimental effect    # LibSoX-only effect
    EFFECT OPTIONS (effopts): effect dependent; see --help-effect
    Error in /usr/local/SPTK/bin/excite -p 80 /home/mrplow/Desktop/HTS/HTS-demo_CMU-ARCTIC-ADAPT/HTS-demo_CMU-ARCTIC-ADAPT/gen/qst001/ver1/SI/0/cmu_us_arctic_slt_alice01.pit | /usr/local/SPTK/bin/mglsadf -m 24 -p 80 -a 0.42 -c 0 /home/mrplow/Desktop/HTS/HTS-demo_CMU-ARCTIC-ADAPT/HTS-demo_CMU-ARCTIC-ADAPT/gen/qst001/ver1/SI/0/cmu_us_arctic_slt_alice01.mgc | /usr/local/SPTK/bin/x2x +fs | /usr/bin/sox -c 1 -s -w -t raw -r 16000 - -c 1 -s -w -t wav -r 16000 /home/mrplow/Desktop/HTS/HTS-demo_CMU-ARCTIC-ADAPT/HTS-demo_CMU-ARCTIC-ADAPT/gen/qst001/ver1/SI/0/cmu_us_arctic_slt_alice01.wav
    it can probably be fixed by changing scripts/Config.pm line 248
    $SOXOPTION = 'w';
    but I've spent enough time and I'll wait until someone tries out that new festival and reports back
    Last edited by amanisdude; March 10th, 2012 at 06:59 AM. Reason: information correction
