Thread: Need to convert PDF to Audio MP3

  1. #1
    Join Date
    Jul 2005
    Location
    Remote Desert, USA
    Beans
    683

    Need to convert PDF to Audio MP3

    My wife wants me to come up with a Linux script or program that can convert a PDF file into an audio MP3 that she can study while in her car or while working out.

    Anyone got a Perl, Bash, or other script that can do this?
    SuperMike
    When in doubt, follow the penguins.
    Evil Kitty is watching you

  2. #2
    Join Date
    Jun 2006
    Location
    UnderTheSea
    Beans
    265

    Re: Need to convert PDF to Audio MP3

    festival or gspeaker
    So She Said Internet, And I Said Inter-Not



  3. #3
    Join Date
    Jan 2010
    Beans
    10

    Re: Need to convert PDF to Audio MP3

    Hi, I had the same problem. The attachment is a file called pdf2mp3.py, which converts a PDF or plain-text (ASCII) file into an MP3 audio file.

    I wanted to listen to work-related publications while exercising during my lunch break. Since most of what I found on the web was commercial software, I wrote my own Python script that converts a PDF file (or an ASCII/text file) into an MP3 file. It should be easy to get working on Ubuntu or another Linux system; you just need to install the following packages:

    Code:
     sudo apt-get install python poppler-utils festival festvox-rablpc16k lame
    Then download the attached file pdf2mp3.py and, in the directory where you saved it, make it executable:

    Code:
    chmod +x pdf2mp3.py
    To be able to run it system-wide, copy it to a directory on your PATH:
    Code:
    sudo cp pdf2mp3.py /usr/bin/
    then you only have to call the script in your shell via:
    Code:
    pdf2mp3.py yourfilename.pdf
    or via:
    Code:
    pdf2mp3.py yourfilename.txt
    or via:
    Code:
    pdf2mp3.py yourfilename.dat
    and an MP3 file is created.

    PS: It currently uses the English voice. Other voices can be downloaded here: http://cslu.cse.ogi.edu/tts/download/
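
    Since the attachment itself is not reproduced in this thread, here is a rough sketch of the same idea as a shell pipeline. It is not the original pdf2mp3.py; it only assumes the tools from the package list above (pdftotext from poppler-utils, text2wave from festival, and lame). The function names mp3_name_for and pdf_to_mp3 are made up for this example.

```shell
#!/bin/sh
# Sketch of a PDF -> text -> WAV -> MP3 pipeline (NOT the original pdf2mp3.py).
# Assumed tools: pdftotext (poppler-utils), text2wave (festival), lame.

# mp3_name_for FILE: print FILE with its last extension replaced by .mp3.
# Assumes FILE has an extension such as .pdf or .txt.
mp3_name_for() {
    printf '%s.mp3\n' "${1%.*}"
}

# pdf_to_mp3 FILE.pdf: write FILE.mp3 next to the input file.
pdf_to_mp3() {
    in=$1
    out=$(mp3_name_for "$in")
    tmp=$(mktemp -d) || return 1
    # Each step runs only if the previous one succeeded.
    pdftotext "$in" "$tmp/text.txt" &&
    text2wave "$tmp/text.txt" -o "$tmp/speech.wav" &&
    lame "$tmp/speech.wav" "$out"
    status=$?
    rm -rf "$tmp"
    return $status
}

# Usage (after defining the functions): pdf_to_mp3 yourfilename.pdf
```

    The intermediate files live in a temporary directory that is removed afterwards, so only the MP3 is left next to the input.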
    Last edited by rennau80; January 8th, 2010 at 03:21 PM.

  4. #4
    Join Date
    Dec 2008
    Beans
    5

    Re: Need to convert PDF to Audio MP3

    I am having a problem. It says:

    chmod: cannot access `pdf2mp3.py': No such file or directory

    Please advise. Here is what the terminal shows for the commands:

    nahar@nahar-desktop:~$ sudo apt-get install python poppler-utils festival festvox-rablpc16k lame (command 1)
    [sudo] password for nahar:
    Reading package lists... Done
    Building dependency tree
    Reading state information... Done
    python is already the newest version.
    poppler-utils is already the newest version.
    festival is already the newest version.
    festvox-rablpc16k is already the newest version.
    lame is already the newest version.
    The following packages were automatically installed and are no longer required:
    linux-headers-2.6.31-14 libnss3-dev libfltk1.1 libnspr4-dev
    linux-headers-2.6.31-14-generic
    Use 'apt-get autoremove' to remove them.
    0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
    nahar@nahar-desktop:~$ chmod +x pdf2mp3.py (command 2)problem starts here
    chmod: cannot access `pdf2mp3.py': No such file or directory
    nahar@nahar-desktop:~$
    Last edited by aiiaalla; January 9th, 2010 at 11:19 AM.

  5. #5
    Join Date
    Jan 2010
    Beans
    10

    Re: Need to convert PDF to Audio MP3

    It seems that you downloaded pdf2mp3.py to a different folder. If you run

    Code:
    ls pdf2mp3.py
    in the directory where you saved it, it should list the file. In your case the file is simply not in the current directory (your home directory), so you need to find out which directory you downloaded it to and run the chmod command there.
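
    If you are not sure where the download went, something like the following could track it down. It is only a sketch: find_script is a helper name invented for this example, and it searches your home directory unless told otherwise.

```shell
#!/bin/sh
# find_script NAME [DIR]: print the path of the first regular file called NAME
# found under DIR (DIR defaults to $HOME). Prints nothing if there is no match.
find_script() {
    find "${2:-$HOME}" -type f -name "$1" 2>/dev/null | head -n 1
}

# Example: locate the download, then make it executable where it actually is.
# path=$(find_script pdf2mp3.py)
# [ -n "$path" ] && chmod +x "$path"
```

    Alternatively, cd into the folder that contains the file and run the chmod command from there.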

  6. #6
    Join Date
    Dec 2008
    Beans
    5

    Re: Need to convert PDF to Audio MP3

    nahar@nahar-desktop:~$ ls pdf2mp3.py
    ls: cannot access pdf2mp3.py: No such file or directory
    nahar@nahar-desktop:~$

    The file is on the desktop.

  7. #7
    Join Date
    Sep 2012
    Beans
    2

    Re: Need to convert PDF to Audio MP3

    Server not found. Can you attach the file again? Thanks.

  8. #8
    Join Date
    Jan 2008
    Beans
    54

    Re: Need to convert PDF to Audio MP3

    This one works in a console. It converts plain text only, but there are many ways to extract text from a PDF.

    I gave up on the mbrola voices with espeak because at some point I realized they are less clear.

    At first they seem to sound nicer and more natural, but after a while you conclude that plain espeak is definitely clearer, and you can listen at a higher speed.

    The problem with espeak+mbrola is that there are micro-gaps between pronounced letters and words. The more I listened to it, the less I liked it.

    Here's the script. Save it in a file and chmod +x it.
    The script accepts plain text only. No spaces in the filename.
    Usage:

    script file.txt language_code

    reads the file aloud (en for English, and so on), or

    script file.txt language_code l

    encodes with lame; g instead of l encodes with gogo, which is faster (a considerable difference when converting an unabridged book).


    Code:
    #!/bin/sh
    # txt2mp3 - convert text files to mp3 audio files (aka audiobooks)
    # v0.1
    #
    # (c) 2011 http://www.gnu.org/copyleft/gpl.html
    #
    # OBS.: install some pre-requisites first, with
    #       sudo apt-get install espeak lame gogo
    
    
    # espeak parameters
    speed=200	#160
    pitch=70	#50
    paragraph=40	#0
    gap=0
    lang=$2
    
    
    
    # Print the usage text (shared by all argument checks).
    usage() {
        echo "$0 file.txt lang_code [g | l]"
        echo "g is for gogo, l for lame encoding."
        echo "Without g or l text will be played only."
        echo "Gogo is more than 2 times faster than lame "
        echo "and with speech encoding there are no "
        echo "substantial quality differences."
        echo "Language code is en, pl, ru and so on."
        echo "Refer to espeak manual."
        echo "Look inside the file to change speed, pitch etc."
        echo "because these are kind of fixed preferences."
    }
    
    if [ "$1" = "" ]; then
        clear
        echo "No arguments. The usage is:"
        usage
        exit 1
    fi
    
    if [ "$2" = "" ]; then
        echo "Not enough arguments. The usage is:"
        usage
        exit 1
    fi
    
    # A bare g or l in second position means the language code was left out.
    if [ "$2" = g ] || [ "$2" = l ]; then
        echo "The second argument should be the language: en, pl and so on."
        echo "The optional 'g' or 'l' goes on third position."
        exit 1
    fi
    
    
    
    
    # encoding
    encode="$3"    # If g or l on the command line is omitted, the text will be played.
    
    TXT_FILE="$1"
    # Strip the last extension, whatever its length (file.txt -> file)
    BASENAME="${TXT_FILE%.*}"
    
    echo "Processing ${TXT_FILE} with TTS"
    
    
    # espeak options: -g word gap, -s speed, -l line length, -p pitch, -v voice
    if [ "$encode" = l ] ; then
    	espeak -f "$1" -g $gap -s $speed -l $paragraph -p $pitch -v"$lang" --stdout | \
    	lame --verbose  --preset cbr 32  - "${BASENAME}_espeak-l.mp3"
    	echo "...done! Voice saved as ${BASENAME}_espeak-l.mp3"
    elif [ "$encode" = g ] ; then
    	espeak -f "$1" -g $gap -s $speed -l $paragraph -p $pitch -v"$lang" --stdout | \
    	gogo stdin -b 32 -m m -emp 5 "${BASENAME}_espeak-g.mp3"
    	echo "...done! Voice saved as ${BASENAME}_espeak-g.mp3"
    else
    	espeak -f "$1" -s $speed -l $paragraph -p $pitch -v"$lang"
    	echo "...done! Add 'g' or 'l'  to command line to encode. "
    fi

    I also cut the file afterwards into 15-minute chunks because my phone handles them much better that way (better scrolling, quicker search for a chapter if necessary).

    I use mp3splt with -O 0.2, which makes an overlap of 2 seconds between chunks.
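
    The splitting step could look roughly like the sketch below. It assumes mp3splt's time mode (-t) for the 15-minute chunks, which the post does not spell out, together with the -O overlap option mentioned above; as far as I know mp3splt writes both times as min.sec. The split_chunks name and the DRY_RUN variable are invented for this example.

```shell
#!/bin/sh
# split_chunks FILE.mp3: cut FILE.mp3 into 15-minute pieces with a 2-second overlap.
# mp3splt times are written min.sec, so 15.0 is 15 minutes and 0.2 is 2 seconds.
# If DRY_RUN is set (a variable made up for this sketch), only print the command.
split_chunks() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "mp3splt -t 15.0 -O 0.2 $1"
    else
        mp3splt -t 15.0 -O 0.2 "$1"
    fi
}

# Usage: split_chunks book_espeak-l.mp3
```

    Running it once with DRY_RUN=1 first shows the exact command before anything is written.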
    Last edited by frytek; October 10th, 2012 at 04:03 PM.

  9. #9
    Join Date
    Jan 2008
    Beans
    54

    Re: Need to convert PDF to Audio MP3

    I also found some code that does the actual conversion, but I don't use it, as I usually convert ebooks to txt and then to mp3 (not pdf, doc or odt), and I do it from Calibre.
    Anyway, if someone would like to use this, here it goes:

    Install some prerequisites first, with
    # sudo apt-get install espeak lame xpdf-utils odt2txt antiword



    Code:
    # if it isn't a TXT file, convert it first
    if [ "$ext" != "txt" ] ; then
        TMP_FILE="/tmp/espeakfile-$$.txt"
    
        # PDF
        if [ "$ext" = "pdf" ] ; then
            echo "converting from PDF to TXT"
            pdftotext "${TXT_FILE}" "${TMP_FILE}"
        fi
    
        # ODT
        if [ "$ext" = "odt" ] ; then
            echo "converting from ODT to TXT"
            odt2txt --subst=all "${TXT_FILE}" > "${TMP_FILE}"
        fi
    
        # DOC
        if [ "$ext" = "doc" ] ; then
            echo "converting from DOC to TXT"
            antiword "${TXT_FILE}" > "${TMP_FILE}"
        fi
    
        TXT_FILE="${TMP_FILE}"
    fi
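
    The fragment above relies on $ext and $TXT_FILE being set earlier, which is not shown. One way they might be derived is sketched below; file_ext is a helper name invented for this example, not part of the code I found.

```shell
#!/bin/sh
# file_ext NAME: print the lower-cased text after the last dot in the file name
# (directory part ignored), or an empty line if the name contains no dot.
file_ext() {
    name=${1##*/}
    case $name in
        *.*) printf '%s\n' "${name##*.}" | tr '[:upper:]' '[:lower:]' ;;
        *)   printf '\n' ;;
    esac
}

# Assumed wiring for the conversion fragment:
# TXT_FILE="$1"
# ext=$(file_ext "$TXT_FILE")
```

    Lower-casing the extension means Book.PDF is handled the same way as book.pdf.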
    Last edited by frytek; October 10th, 2012 at 04:04 PM.
