Thread: PDF Support (Re: OCR)

  1. #1
    Join Date
    Jan 2011
    Beans
    24

    PDF Support (Re: OCR)

    I have spent the last few days searching for Linux solutions for running OCR on PDFs. From everything I have read, it requires scripts, converting, etc., and quite honestly that is far too much effort with everything else I have to do to get my assignments done for 6 classes.

    My question is: has anyone recently released a PDF reader that supports OCR? I have installed and tried so much stuff. I was honestly hoping OpenOffice would support it, but I don't see it there either.

    Why is support for this so lacking?

  2. #2
    Join Date
    Mar 2007
    Location
    Portsmouth, UK
    Beans
    Hidden!
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: PDF Support (Re: OCR)

    In my experience, there isn't a great demand for OCR software. If it's scanning a printed document, then a digital copy likely exists already; if it's hand-written, then I've yet to see software that was any good at interpreting it.

    There are some good options and comparisons here:

    http://www.splitbrain.org/blog/2010-...are_comparison

  3. #3
    Join Date
    Feb 2011
    Beans
    11
    Distro
    Ubuntu

    Re: PDF Support (Re: OCR)

    Quote: Originally posted by xtremezj
    I have spent the last few days searching for Linux solutions for running OCR on PDFs. From everything I have read, it requires scripts, converting, etc., and quite honestly that is far too much effort with everything else I have to do to get my assignments done for 6 classes.

    My question is: has anyone recently released a PDF reader that supports OCR? I have installed and tried so much stuff. I was honestly hoping OpenOffice would support it, but I don't see it there either.

    Why is support for this so lacking?
    I read that LibreOffice supports PDF files. Have you tried LibreOffice?

  4. #4
    Join Date
    Jan 2011
    Beans
    24

    Re: PDF Support (Re: OCR)

    Quote: Originally posted by Grenage
    In my experience, there isn't a great demand for OCR software. If it's scanning a printed document, then a digital copy likely exists already; if it's hand-written, then I've yet to see software that was any good at interpreting it.

    There are some good options and comparisons here:

    http://www.splitbrain.org/blog/2010-...are_comparison
    I'll take a look at that. The problem with scanned documents is that some of my professors use them exclusively, made from self-generated material or original documents we can't get our hands on. Since I am a Psychology major, it's a lot of reading and I can't search it.

  5. #5
    Join Date
    Jan 2011
    Beans
    24

    Re: PDF Support (Re: OCR)

    Quote: Originally posted by manclip
    I read that LibreOffice supports PDF files. Have you tried LibreOffice?
    Lots of things support PDF; it's the OCR that's missing.

  6. #6
    Join Date
    Mar 2007
    Location
    Portsmouth, UK
    Beans
    Hidden!
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: PDF Support (Re: OCR)

    Quote: Originally posted by xtremezj
    I'll take a look at that. The problem with scanned documents is that some of my professors use them exclusively, made from self-generated material or original documents we can't get our hands on. Since I am a Psychology major, it's a lot of reading and I can't search it.
    I feel your pain; unfortunately I misread your original post, and the link I gave you refers to document formats other than PDF. If I come across anything that looks suitable, I'll post back.

  7. #7
    Join Date
    Jan 2011
    Beans
    24

    Re: PDF Support (Re: OCR)

    Thanks man

  8. #8
    Join Date
    Mar 2007
    Location
    Portsmouth, UK
    Beans
    Hidden!
    Distro
    Ubuntu 11.10 Oneiric Ocelot

    Re: PDF Support (Re: OCR)

    If your main issue is PDFs that can't be searched (images), this looks like it might be an option. It still requires a command line, but one could probably make a simple script that converts all documents in a folder (and runs from a link/shortcut).

  9. #9
    Join Date
    Jan 2011
    Beans
    24

    Re: PDF Support (Re: OCR)

    Sweet man, I'll read up on that today and let you know how it works out

  10. #10
    Join Date
    Jul 2005
    Location
    England
    Beans
    Hidden!

    Re: PDF Support (Re: OCR)

    There are also some online OCR sites which can "scan and OCR" your PDF files, turning them into text or Word docs, e.g.
    http://www.newocr.com/
    http://www.onlineocr.net/default.aspx
    I have used both just to try them out with a variety of image formats, and they seem very good and accurate, though I think there are limits on the number of documents per user unless you pay or are registered.

    I am also now using tesseract v3 for any OCR I do locally. It is superbly accurate in comparison with gocr, which I generally find a waste of time. You could always open your PDF files in GIMP and save them as TIF files, then run tesseract on those to make searchable txt files. A bit of a palaver, but technically possible.
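
    The PDF to TIF to tesseract route could probably be scripted so it runs over a whole folder. What follows is only a rough sketch under a few assumptions: it swaps the manual GIMP step for `pdftoppm` (from poppler-utils), it expects both `pdftoppm` and `tesseract` to be installed and on the PATH, and the function names and the `_ocr` scratch folder are made up for illustration.

```python
import subprocess
from pathlib import Path


def render_cmd(pdf: Path, out_prefix: Path, dpi: int = 300) -> list[str]:
    """Build the pdftoppm command that renders each page to <out_prefix>-<n>.tif."""
    return ["pdftoppm", "-tiff", "-r", str(dpi), str(pdf), str(out_prefix)]


def ocr_cmd(image: Path, out_base: Path, lang: str = "eng") -> list[str]:
    """Build the tesseract command that writes <out_base>.txt from one image."""
    return ["tesseract", str(image), str(out_base), "-l", lang]


def ocr_pdf(pdf: Path) -> Path:
    """OCR one scanned PDF and write the text next to it as <name>.txt."""
    # Hypothetical scratch folder for the page images, e.g. notes_ocr/
    work_dir = pdf.parent / (pdf.stem + "_ocr")
    work_dir.mkdir(exist_ok=True)
    subprocess.run(render_cmd(pdf, work_dir / "page"), check=True)

    page_texts = []
    # pdftoppm zero-pads page numbers, so sorting by name should keep page order.
    for image in sorted(work_dir.glob("page-*.tif")):
        out_base = image.with_suffix("")  # tesseract appends ".txt" itself
        subprocess.run(ocr_cmd(image, out_base), check=True)
        page_texts.append(out_base.with_suffix(".txt").read_text(encoding="utf-8"))

    text_file = pdf.with_suffix(".txt")
    text_file.write_text("\n".join(page_texts), encoding="utf-8")
    return text_file


def ocr_folder(folder: Path) -> list[Path]:
    """Run ocr_pdf on every PDF directly inside the folder."""
    return [ocr_pdf(pdf) for pdf in sorted(folder.glob("*.pdf"))]
```

    Calling `ocr_folder(Path("readings"))` (the folder name is just an example) would leave a plain txt file beside each PDF, which can then be searched with any text tool. It does not make the PDF itself searchable, and accuracy will depend on scan quality.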
