
Thread: Read text from image in python

  1. #1
    Join Date
    Apr 2011
    Location
    Chittagong,Bangladesh
    Beans
    149
    Distro
    Ubuntu 14.04 Trusty Tahr

    Read text from image in python

    How can I read text from an image file using Python 3.x? Is there a module for this?

  2. #2
    Join Date
    Oct 2007
    Beans
    1,914
    Distro
    Lubuntu 12.10 Quantal Quetzal

    Re: Read text from image in python

    Do you mean optical character recognition (OCR), or are you asking about reading something like embedded comment text from the image files?

  3. #3
    Join Date
    Apr 2011
    Location
    Chittagong,Bangladesh
    Beans
    149
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: Read text from image in python

    Yes, I mean OCR.

  4. #4
    Join Date
    Oct 2007
    Beans
    1,914
    Distro
    Lubuntu 12.10 Quantal Quetzal

    Re: Read text from image in python

    This is what a quick Google search turned up: http://code.google.com/p/pytesser/ and http://stackoverflow.com/questions/5...odule-in-linux. It looks like a usable solution.

  5. #5
    Join Date
    Apr 2011
    Location
    Chittagong,Bangladesh
    Beans
    149
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: Read text from image in python

    But http://code.google.com/p/pytesser/ is not for Python 3.x; it has been tested with Python 2.4.

  6. #6
    Join Date
    Oct 2007
    Beans
    1,914
    Distro
    Lubuntu 12.10 Quantal Quetzal

    Re: Read text from image in python

    The first answer in http://stackoverflow.com/questions/5...odule-in-linux contains a code snippet that is so short that it should be easy to adapt to Python 3. Perhaps even no modification is necessary.
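
    For reference, a minimal Python 3 sketch of the general idea (calling the Tesseract command-line tool from Python) might look like the following. This is not the snippet from the linked answer. It assumes the `tesseract` binary is installed and on the PATH, and that the installed version accepts `stdout` as its output base; the function names are made up for illustration.

```python
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for a Tesseract CLI call.

    Passing "stdout" as the output base asks Tesseract to print the
    recognized text instead of writing a .txt file (assumed to be
    supported by the installed version).
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on image_path and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout.decode("utf-8")


if __name__ == "__main__":
    import sys

    # Usage: python3 ocr.py scan.png
    if len(sys.argv) > 1:
        print(image_to_text(sys.argv[1]))
```

    Keeping the command construction in its own function makes it easy to check or adjust the arguments without running Tesseract.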

  7. #7
    Join Date
    Apr 2011
    Location
    Chittagong,Bangladesh
    Beans
    149
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: Read text from image in python

    Thanks.

  8. #8
    Join Date
    Dec 2007
    Location
    Behind you!!
    Beans
    978
    Distro
    Ubuntu 10.04 Lucid Lynx

    Re: Read text from image in python

    This looks interesting. I pulled it from the previous Stack Overflow link:

    http://code.google.com/p/ocropus/


  9. #9
    Join Date
    Apr 2011
    Location
    Chittagong,Bangladesh
    Beans
    149
    Distro
    Ubuntu 14.04 Trusty Tahr

    Re: Read text from image in python

    Thanks, Bodsda.

  10. #10
    Join Date
    Jan 2013
    Beans
    1

    Re: Read text from image in python

    As of Jan 2013, OCRopus and Tesseract do not work well with input from digital cameras or screenshots.
    The OCRopus project says it is designed for input from scanners at 300 to 600 dpi, and its FAQ says:
    .....
    Out of the box, it will work poorly on other kinds of inputs, although you may be able to adapt it.
    Inputs it will not work on are:
    • handwriting
    • unprocessed digital camera-captured documents
    • text in photographic images
    • CAPTCHAs
    ......
    High-resolution photos of printed text might work, but I haven't tried this.

    I have tried using Tesseract on screenshots and it fails miserably. This is mostly due to the way Windows' ClearType and Apple's anti-aliasing render text, using what is called font smoothing or sub-pixel rendering. They essentially treat the red, green, and blue sub-pixels within each pixel as independent dots, which gives roughly 3x the horizontal resolution. This is why you see color halos around some text if you look closely.
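
    One preprocessing idea that is sometimes tried for screenshots is to collapse the colored fringes into grayscale and enlarge the image before handing it to the OCR engine. The sketch below only illustrates that idea and is not a tested fix for the problem described above. It works on plain nested lists of (R, G, B) tuples to stay dependency-free; in practice an imaging library would do this. The function names and the scale factor are assumptions.

```python
def to_grayscale(pixels):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values.

    Uses the common integer weights 299/587/114, so a color fringe such
    as pure red becomes a mid-dark gray instead of a colored halo.
    """
    return [
        [(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
        for row in pixels
    ]


def upscale(gray, factor):
    """Enlarge a grayscale image by repeating each pixel factor times
    horizontally and each row factor times vertically (nearest neighbor)."""
    out = []
    for row in gray:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


if __name__ == "__main__":
    # A tiny 1x3 "screenshot": white background, red fringe, black text.
    shot = [[(255, 255, 255), (255, 0, 0), (0, 0, 0)]]
    gray = to_grayscale(shot)
    big = upscale(gray, 3)  # factor 3 is an arbitrary example value
    print(gray)
    print(len(big), len(big[0]))
```

    Nearest-neighbor scaling is used here only because it is the simplest to write; a smoother interpolation may suit OCR better.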

    Unfortunately, I have still not found an open-source library that handles such images, nor have I found a good description of suitable algorithms. They must be out there, because numerous apps, and even MS OneNote, can OCR text from photos.
