Are you looking for ways to extract text and images from a handwritten, printed or typewritten document? The easiest way is to scan the document and use Optical Character Recognition (OCR) software to extract the content.
There are several OCR programs on the market, but many are commercial and only a few are freeware. We earlier covered SimpleOCR, a free OCR program that reads a hard copy document set in standard fonts and converts it into an editable soft copy. SimpleOCR is available both as freeware and as a commercial version. Here we cover FreeOCR, which is both scanner software and OCR software. It is a complete scan-and-OCR program that bundles a Windows build of the free Tesseract OCR engine, which is why it is also described as a Tesseract GUI.
FreeOCR is not only free but also very easy to use. It supports Optical Character Recognition (OCR) of multi-page TIFF, Adobe PDF and fax documents, as well as most image types, including compressed TIFF. Its handling of scanned PDFs is based on the Tesseract OCR engine, an open source project whose development has been sponsored by Google.
To use FreeOCR, you need .NET Framework 2.0 installed on your PC. The underlying Tesseract OCR engine requires images at a resolution of 200 dpi or greater, so it is not suited to reading PC screenshots, which are only about 72 dpi. For best results, the developers recommend scanning documents at 300 dpi in grayscale.
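To make those resolution figures concrete, here is a small Python sketch of the arithmetic. It is not part of FreeOCR; the function names are made up for illustration, and the only numbers taken from the article are the 200 dpi minimum and the 300 dpi recommendation.

```python
# Illustrative helpers only; none of these names come from FreeOCR or Tesseract.
MIN_DPI = 200          # minimum resolution quoted above for the Tesseract engine
RECOMMENDED_DPI = 300  # scan resolution the developers recommend


def is_ocr_ready(dpi):
    """Return True if an image meets the 200 dpi minimum."""
    return dpi >= MIN_DPI


def upscale_factor(dpi, target_dpi=RECOMMENDED_DPI):
    """How much an image would have to be enlarged to reach target_dpi.

    Returns 1.0 when the image is already at or above the target.
    """
    return max(1.0, target_dpi / dpi)


def scan_size_pixels(width_in, height_in, dpi=RECOMMENDED_DPI):
    """Pixel dimensions of a page of the given size (in inches) scanned at dpi."""
    return (round(width_in * dpi), round(height_in * dpi))


if __name__ == "__main__":
    print(is_ocr_ready(72))           # a typical screenshot
    print(upscale_factor(72))         # enlargement needed to reach 300 dpi
    print(scan_size_pixels(8.5, 11))  # US Letter page at 300 dpi
```

A 72 dpi screenshot would need to be enlarged a little over four times to match a 300 dpi scan, and enlarging adds no real detail, which is why rescanning at the recommended setting is usually the better option.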
FreeOCR cannot read images that are upside down or rotated by 90 degrees, so use the rotate buttons to turn such images upright before running OCR on them. You can also select the text area for OCR by drawing a box around it, which gives better results than trying to OCR whole pages. FreeOCR supports scanned PDFs, i.e. PDFs that contain a page image, and it gives better results with clean scans.
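If you would rather script the same rotate-then-select workflow instead of clicking through the FreeOCR window, the idea can be sketched in Python. This is an assumption-laden sketch, not how FreeOCR itself works: it assumes the Pillow and pytesseract packages and a Tesseract install are available, and the helper names clamp_box and ocr_region are invented for this example.

```python
def clamp_box(box, width, height):
    """Clip a (left, top, right, bottom) selection to the image bounds."""
    left, top, right, bottom = box
    left, right = sorted((max(0, min(left, width)), max(0, min(right, width))))
    top, bottom = sorted((max(0, min(top, height)), max(0, min(bottom, height))))
    return (left, top, right, bottom)


def ocr_region(path, box, clockwise_turns=0):
    """Turn the scan upright, crop to the selected box, and OCR that region.

    Assumes Pillow and pytesseract are installed. They are imported here
    so that clamp_box above can be used without them.
    """
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        # Pillow rotates counterclockwise, so negate the angle to turn clockwise.
        upright = image.rotate(-90 * clockwise_turns, expand=True)
        # Crop to the selection, clipped to the rotated image's size.
        region = upright.crop(clamp_box(box, *upright.size))
        # Grayscale input, in line with the scanning advice above.
        return pytesseract.image_to_string(region.convert("L"))
```

For example, ocr_region("scan.png", (100, 200, 1200, 900), clockwise_turns=2) would flip an upside-down scan and read only the boxed area; the file name and coordinates here are placeholders.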
Download FreeOCR and enjoy the free scanner and OCR software.