Best Open Source OCR Software
OCR stands for Optical Character Recognition, a technology used to convert different types of documents, such as scanned paper documents, PDF files or images captured by a digital camera, into editable and searchable data. In other words, it is the electronic conversion of images of typed, handwritten or printed text into machine-encoded text. It is a very useful technology for data entry from printed records such as computerized receipts, business cards, passport documents, bank statements, invoices or any other documentation.
OCR software is very useful when you need to edit or add information in scanned documents. Open source OCR offers further benefits: it is free of cost, and because the source code is available, you can change its functions according to your needs. Let's look at some of the best open source Optical Character Recognition software solutions.
Tesseract is an excellent open source Optical Character Recognition (OCR) software solution. It has been improved and maintained by Google. It is designed to support multiple platforms, including Linux, Windows and OS X, and it supports 60 different languages. It is released under the Apache License 2.0. It is a little complicated to use, but it produces very accurate results. It provides a number of features to its users, including a static classifier, outline fragments, recognition of broken characters, full Unicode (UTF-8) support, use of the UNLV regression test framework and many more.
Download link: https://code.google.com/p/tesseract-ocr/downloads/list
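Since Tesseract is described above as a little complicated to use, a short sketch of a command-line call may help. It assumes the tesseract executable is installed and on the PATH, and that the requested language data (here eng) is available. The helper names build_tesseract_command and run_ocr, and the file names in the usage note, are illustrative only and are not part of Tesseract itself.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def run_ocr(image_path, output_base, lang="eng"):
    """Run Tesseract if it is installed; return the recognized text, else None."""
    if shutil.which("tesseract") is None:
        return None  # assumption: the executable is named "tesseract" and is on PATH
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # By default Tesseract writes plain text to "<output_base>.txt".
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

For example, run_ocr("scan.png", "scan") would be expected to write scan.txt next to the script and return its contents, or return None when Tesseract is not installed.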
GOCR is another open source Optical Character Recognition software solution. It is released under the GNU General Public License. It supports multiple platforms, including Linux, Windows and OS/2. GOCR converts scanned images of text back into text files and recognizes the letters and numbers contained in an image file. Beyond recognizing characters, it produces output that you can edit in any text processing application. GOCR can read pnm, pbm, pgm, ppm, some pcx and tga image files. It generates very accurate results, but it is also difficult to use.
Download link: http://jocr.sourceforge.net/download.html
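Because GOCR reads only a handful of image formats, it can be useful to check a file before handing it over. The sketch below is a rough illustration under a few assumptions: the gocr executable is installed, its -i and -o options name the input image and the output text file, and the file extension is a good enough hint of the format (GOCR reads only some pcx and tga files, so passing this check does not guarantee success). The helper names are hypothetical.

```python
from pathlib import Path

# Extensions for the formats listed above; matching on the extension
# is only a rough filter, not a check of the actual file contents.
GOCR_EXTENSIONS = {".pnm", ".pbm", ".pgm", ".ppm", ".pcx", ".tga"}


def is_gocr_readable(image_path):
    """Return True if the file extension is one GOCR is listed as reading."""
    return Path(image_path).suffix.lower() in GOCR_EXTENSIONS


def build_gocr_command(image_path, text_path):
    """Return the argument list for: gocr -i <image> -o <text file>."""
    if not is_gocr_readable(image_path):
        raise ValueError(f"unsupported image format: {image_path}")
    return ["gocr", "-i", str(image_path), "-o", str(text_path)]
```

The returned list can be passed to subprocess.run on a machine where GOCR is installed. Images in other formats, such as JPG or PNG, would first need to be converted to one of the formats above.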
CuneiForm is an open source, user-friendly Optical Character Recognition software solution. It was originally released as a commercial OCR product and was later released as freeware on December 12, 2007. It is now available under an open source BSD license. It supports multiple languages and is cross-platform. This application is very easy to use compared to Tesseract and GOCR. It enables you to load images from a local folder or from a scanning device. It also supports different image formats such as JPG, BMP or PNG. It can recognize tables of different structures and different types of fonts. It includes dictionary verification to improve recognition accuracy.
Download link: http://www.brothersoft.com/cuneiform-ocr-pro-9982.html