PDF to ODF convertor??



kevdog
October 28th, 2007, 06:20 PM
Any suggestions on how to convert PDF to ODF? I can do pdf2ps, but I'm kind of stuck after that!

SunnyRabbiera
October 28th, 2007, 06:42 PM
Well, if it's a text-based PDF you could just copy the text...

kevdog
October 28th, 2007, 07:11 PM
I tried that -- it didn't exactly work -- all the text was copied but the formatting was destroyed.

Has anyone ever used unoconv?
http://dag.wieers.com/home-made/unoconv/

Martje_001
October 28th, 2007, 07:16 PM
http://media-convert.com/

?

ssam
October 28th, 2007, 07:27 PM
It's not a trivial thing to do. PDF does not contain the structural information that a word processor document does. However, there is work on an OpenOffice PDF import.

Have a read of
http://wiki.services.openoffice.org/wiki/Writer/ToDo/PDF_Import

Also have a look at KWord; I think it has basic PDF import.

kevdog
October 28th, 2007, 09:37 PM
I'll take a look at KWord, and thanks for the heads-up about formatting information not being contained in the PDF.

Has anyone used, or does anyone know anything about, this project? It looks interesting, although I don't know how to use it.
http://dag.wieers.com/home-made/unoconv/

pagingmrherman
November 28th, 2007, 10:49 PM
It works but it's flaky as hell under Gutsy. I use it to convert odt files to MS Word doc files. It's the only thing I know of that can do it from a shell script.

It uses the UNO bindings for OpenOffice and there's a bug in Gutsy that causes it to crash. Practically every day I have to run:

sudo ldconfig -v /usr/lib/openoffice/program

to get it to work.

Also, if soffice is running in headless mode, you can't run any OpenOffice programs. You have to do a

killall soffice

However, as long as you did the ldconfig and have soffice running in headless mode, you can convert documents using

unoconv -f doc myfile.odt

and most of the time it spits out myfile.doc. Sometimes it does a core dump though and I have to redo the ldconfig and/or restart soffice in headless mode.
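
The routine described here (run the conversion, and run it again when it crashes) could be wrapped in a small shell function. This is only a rough sketch, assuming a crash shows up as a non-zero exit status. The function name convert_with_retry and the UNOCONV override variable are made up for illustration; only the "unoconv -f doc" call comes from this post. It does not redo the ldconfig or restart soffice for you.

```shell
# Retry the conversion a few times before giving up.
# Usage: convert_with_retry FILE [MAX_TRIES]
# UNOCONV (hypothetical variable) lets you point at a different binary.
convert_with_retry() {
    file=$1
    max_tries=${2:-3}
    converter=${UNOCONV:-unoconv}
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Same command as in the post; a core dump gives a non-zero status.
        if "$converter" -f doc "$file"; then
            return 0
        fi
        echo "attempt $try of $max_tries failed for $file" >&2
        try=$((try + 1))
    done
    return 1
}
```

Something like "convert_with_retry myfile.odt 5" would then try up to five times and return non-zero if every attempt failed, so a calling script can decide whether to redo the ldconfig step.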

kevdog
November 29th, 2007, 05:07 AM
Sounds like an unreliable solution to me. Thanks for the heads-up.

fluteflute
November 29th, 2007, 06:29 PM
If you just want to edit the PDF you can use PDFedit:

http://www.getdeb.net/app.php?name=PDF+Editor

hanzomon4
November 29th, 2007, 07:35 PM
ODF? What is that, an oss PDF-like format?

ssam
November 29th, 2007, 08:07 PM
hanzomon4 wrote: "ODF? What is that, an oss PDF-like format?"

ISO standard for spreadsheets, charts, presentations and word processing documents.

http://en.wikipedia.org/wiki/OpenDocument

It's OpenOffice's default format, used by most open source word processors and a few closed source ones.

hanzomon4
November 29th, 2007, 08:29 PM
Oh!! I see, it's the catchall name for the various extensions and such like *.odt.

ssam
November 30th, 2007, 09:47 AM
This was posted yesterday: http://www.linux.com/feature/122195

The goal is to write a powerful PDF library and then to build an Acrobat-type application on top of it.

kevdog
December 1st, 2007, 01:29 AM
Thanks for the update -- I hope that project takes off!

bruce89
December 1st, 2007, 01:56 AM
pdftotext might be of use.

pdftoabw seems to convert PDF to AbiWord's format.

Both are in poppler-utils (installed by default).
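
Since copying the text by hand lost the formatting earlier in the thread, pdftotext's -layout option may be worth a try: it asks pdftotext to keep the physical layout of the page using spaces. That is still plain text, not real word processor formatting. A minimal sketch; pdf_to_text is a made-up helper name and the default output name is just an assumption for convenience.

```shell
# Convert a PDF to plain text, keeping the page layout where possible.
# Usage: pdf_to_text INPUT.pdf [OUTPUT.txt]
pdf_to_text() {
    pdf=$1
    # Default output name: same as the input, with .pdf swapped for .txt.
    txt=${2:-${pdf%.pdf}.txt}
    pdftotext -layout "$pdf" "$txt"
}
```

For example, "pdf_to_text report.pdf" would write report.txt next to the original.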