PDF to ODF convertor??

October 28th, 2007, 06:20 PM
Any suggestions on how to convert PDF to ODF? I can do pdf2ps, but I'm kind of stuck after that!

October 28th, 2007, 06:42 PM
Well, if it's a text-based PDF you could just copy the text...

October 28th, 2007, 07:11 PM
I tried that -- it didn't exactly work. All the text was copied, but the formatting was destroyed.

Has anyone ever used this? unoconv

October 28th, 2007, 07:27 PM
It's not a trivial thing to do. PDF does not contain the structural information that a word processor document does. However, there is work on an OpenOffice PDF import.

have a read of

Also have a look at KWord; I think it has basic PDF import.

October 28th, 2007, 09:37 PM
I'll take a look at KWord, and thanks for the heads-up about formatting information not being contained in the PDF.

Has anyone used, or does anyone know anything about, this project? It looks interesting, although I don't know how to use it.

November 28th, 2007, 10:49 PM
It works but it's flaky as hell under gutsy. I use it to convert odt files to MS Word doc files. It's the only thing in the world that can do it in a shell script.

It uses the UNO bindings for OpenOffice and there's a bug in gutsy that causes it to crash. Practically every day I have to run:

sudo ldconfig -v /usr/lib/openoffice/program

to get it to work.

Also, if soffice is running in headless mode, you can't run any OpenOffice programs. You have to do a

killall soffice

However, as long as you did the ldconfig and have soffice running in headless mode, you can convert documents using

unoconv -f doc myfile.odt

and most of the time it spits out myfile.doc. Sometimes it does a core dump though and I have to redo the ldconfig and/or restart soffice in headless mode.
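
Since it only works "most of the time", one way to live with it in a shell script is to retry a failed conversion a few times. This is only a rough sketch: convert_with_retry, UNOCONV and MAX_TRIES are names made up for this example (they are not unoconv options), and it assumes the ldconfig fix has been done and soffice is already running in headless mode.

```shell
#!/bin/sh
# Sketch: retry a flaky conversion a few times before giving up.
# UNOCONV and MAX_TRIES are hypothetical knobs for this example only.
UNOCONV="${UNOCONV:-unoconv}"
MAX_TRIES="${MAX_TRIES:-3}"

convert_with_retry() {
    # $1 = target format (e.g. doc), $2 = input file (e.g. myfile.odt)
    try=1
    while [ "$try" -le "$MAX_TRIES" ]; do
        # Same call as above: unoconv -f doc myfile.odt
        if "$UNOCONV" -f "$1" "$2"; then
            return 0
        fi
        echo "attempt $try failed for $2" >&2
        # This would be the place to redo the ldconfig or restart
        # soffice in headless mode before trying again.
        try=$((try + 1))
    done
    return 1
}
```

With that sourced into a script, the call would look something like: convert_with_retry doc myfile.odt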

November 29th, 2007, 05:07 AM
Sounds like an unreliable solution to me. Thanks for the heads-up.

November 29th, 2007, 06:29 PM
If you just want to edit the PDF you can use pdfedit.


November 29th, 2007, 07:35 PM
ODF? What is that, an oss PDF-like format?

November 29th, 2007, 08:07 PM
ISO standard for spreadsheets, charts, presentations and word processing documents.


OpenOffice's default format. Used by most open source word processors, and a few closed source ones.

November 29th, 2007, 08:29 PM
Oh!! I see, it's the catchall name for the various extensions and such like *.odt.

November 30th, 2007, 09:47 AM
This was posted yesterday: http://www.linux.com/feature/122195

The goal is to write a powerful PDF library and then to build an Acrobat-type application on top of it.

December 1st, 2007, 01:29 AM
Thanks for the update -- I hope that project takes off!

December 1st, 2007, 01:56 AM
pdftotext might be of use.

pdftoabw seems to convert PDF to AbiWord's format.

Both are in poppler-utils (installed by default).
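
For the copy-and-paste problem mentioned earlier (text fine, formatting gone), pdftotext has a -layout option that tries to keep the physical page layout. Here is a small sketch for running it over several files; pdf_to_txt_name and pdfs_to_text are helper names invented for this example, and the result is still plain text, not ODF.

```shell
#!/bin/sh
# Sketch: dump each PDF to a .txt file next to it using pdftotext.
# pdf_to_txt_name and pdfs_to_text are hypothetical helpers for this example.
pdf_to_txt_name() {
    # Strip a trailing .pdf (if any) and append .txt
    printf '%s.txt\n' "${1%.pdf}"
}

pdfs_to_text() {
    for pdf in "$@"; do
        # -layout asks pdftotext to keep the original page layout as best it can
        pdftotext -layout "$pdf" "$(pdf_to_txt_name "$pdf")" \
            || echo "failed: $pdf" >&2
    done
}
```

Something like pdfs_to_text *.pdf would then leave a .txt beside each PDF, which can be opened or pasted into a word processor.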