Sciology = Science + Technology

Commonsense in Technology

Posts Tagged ‘open source’

Commercial and Open Source OCR Software

Posted by sureshkrishna on November 4, 2009

After testing FineReader, OmniPage, ReadIRIS, SimpleOCR, Aspire, and Tesseract, it is evident that ABBYY FineReader 9 is the best overall value, while ReadIRIS is the best OCR software for under $150.

The main features that differentiate OCR software are:

  • Character recognition accuracy
  • Page layout reconstruction accuracy
  • Support for languages
  • Support for searchable PDF output
  • Speed
  • User interface
  • API / SDK
  • Support / Consulting
  • Stability of the engine when processing large documents
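
For the first item, character recognition accuracy, here is a minimal sketch of how one could put a number on it. It assumes you have a hand-typed reference text for a page and the text an OCR engine produced for the same page. The function names `edit_distance` and `character_accuracy` are my own for illustration, and this is not necessarily how any of the vendors measure their accuracy figures.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning one string into the other."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of reference characters recognized correctly (assumed metric:
    1 - errors / reference length, clamped at 0)."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    truth = "hello world"
    ocr_output = "hel1o world"  # a typical OCR confusion: 'l' read as '1'
    print(f"accuracy: {character_accuracy(truth, ocr_output):.3f}")
```

Running the same reference pages through each engine and comparing the scores gives a rough but repeatable way to rank them.
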

Following are some of the software packages that I played with and compared.

SimpleOCR

SimpleOCR is a popular freeware OCR package with hundreds of thousands of users worldwide. It is also a royalty-free OCR SDK that developers can use in their custom applications. If you have a scanner and want to avoid retyping your documents, SimpleOCR is a fast, free way to do it. The SimpleOCR freeware is 100% free and not limited in any way. Anyone can use it for free: home users, educational institutions, even corporate users. The freeware application provides acceptable accuracy for those who just need to convert a few pages and can't justify the cost of commercial OCR software. Developers can use the command-line and SDK versions to integrate SimpleOCR with their custom applications.

 

ABBYY FineReader

FineReader Professional is a highly accurate and easy-to-use OCR package that includes a host of features such as digital camera OCR, intelligent document layouts, image enhancement, barcode recognition and command line integration. FineReader 9 is our pick for OCR software because its document layout retention will save you a lot of time reformatting the documents you convert for editing.

IRIS ReadIRIS

Affordable OCR software for business and home users. ReadIRIS Pro provides an extremely accurate recognition rate at a low cost, and still has some of the advanced features found in higher-priced professional OCR software.

Nuance OmniPage

OmniPage is widely considered the fastest, most accurate and fully featured OCR software.  OmniPage 17 Professional has a unique new feature that lets you convert any type of document to searchable PDF or Word. OmniPage does not have a downloadable demo. Nuance also does not provide free technical support after the first call.  For these reasons we recommend the ABBYY and IRIS products instead.

OmniPage is an optical character recognition application available from Nuance Communications. Nuance Communications was acquired by ScanSoft, which also took over its name in October 2005. OmniPage converts images, such as scanned paper documents and PDF files, into file formats used by computer applications such as Microsoft Word, Excel, Adobe Acrobat, or HTML files. OmniPage is in competition with ExperVision (TypeReader), Readiris and ABBYY FineReader, as well as free software such as GOCR and Tesseract.

http://code.google.com/p/tesseract-ocr
Tesseract is a free optical character recognition engine. It was originally developed as proprietary software at Hewlett-Packard between 1985 and 1995. After ten years without any development taking place, Hewlett-Packard and UNLV released it as open source in 2005. Tesseract is currently developed by Google and released under the Apache License, Version 2.0.
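
Since Tesseract is a command-line engine, it is easy to script. Below is a minimal sketch of calling it from Python, assuming the `tesseract` binary is installed and on the PATH. It relies on the basic `tesseract imagename outputbase -l lang` invocation, which writes the recognized text to `outputbase.txt`. The helper names `build_tesseract_command` and `ocr_image` and the file name `page.tif` are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_tesseract_command(image_path: str, output_base: str,
                            language: str = "eng") -> List[str]:
    """Build the argument list for: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image_path), str(output_base), "-l", language]


def ocr_image(image_path: str, output_base: str,
              language: str = "eng") -> Optional[Path]:
    """Run Tesseract on one image and return the path of the text file it wrote.

    Returns None when the tesseract binary is not found on the PATH.
    """
    if shutil.which("tesseract") is None:
        return None
    subprocess.run(build_tesseract_command(image_path, output_base, language),
                   check=True)
    # Tesseract appends ".txt" to the output base name for plain-text output.
    return Path(str(output_base) + ".txt")


if __name__ == "__main__":
    # "page.tif" is a hypothetical scanned page in the current directory.
    if Path("page.tif").exists():
        result = ocr_image("page.tif", "page")
        print(result.read_text() if result else "tesseract is not installed")
    else:
        print(" ".join(build_tesseract_command("page.tif", "page")))
```

Looping `ocr_image` over a folder of scans gives a simple batch converter without any commercial SDK.
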

http://jmagick.wiki.sourceforge.net
JMagick is an open source Java interface to ImageMagick. It is implemented as a Java Native Interface (JNI) layer over the ImageMagick API. JMagick does not attempt to make the ImageMagick API object-oriented; it is merely a thin interface layer on top of it. JMagick currently implements only a subset of the ImageMagick APIs. Should you require features that JMagick does not implement, please join the mailing list and make a request. JMagick is licensed under the LGPL (GNU Lesser General Public License).

http://www.expervision.com
The award-winning TypeReader converts scanned documents into electronic files at a speed of 8,000 pages per hour with maximum reliability. Desktop 7.0 offers added flexibility to handle color and grayscale images, with duplex scanning support to process documents in English, French, German, Italian, Portuguese, Spanish, Dutch, Danish, Swedish, Norwegian, Finnish, Polish, Hungarian and Polynesian. It employs an unparalleled recognition technology to support 2618 fonts. Users can choose to output to various formats including PDF, MS Word, Excel, Lotus 1-2-3, HTML, etc.

http://www.edocfile.com
Tiff to Text is designed to perform Optical Character Recognition (OCR) in a batch process. The program uses the OCR engine from Nuance (owners of OmniPage, formerly ScanSoft) that is included with Microsoft Office Document Imaging (MODI).

http://www.simpleocr.com/OCR_Software_Guide.asp

Posted in software, Technology | 6 Comments »

Top 10 reasons to use Eclipse…

Posted by sureshkrishna on October 13, 2007

  1. Eclipse is Free: I have seen very few organizations that will pay for expensive IDEs. As a developer, if I need something to play with, I don't have too many choices. Eclipse is free, and I can download it without too many hassles. Of course there are some other free IDEs too (I don't want to mention names), but they don't score as well in some other aspects. I get so many features for free.
  2. Eclipse Community & Industry Support: When I want to explain Eclipse to my boss and customers, it is important to know who is behind the project. So far I have had very few arguments about the credibility of the project and the people behind it. The way it has spread throughout the developer community in the USA, Europe, and Asia is especially great. Developers celebrated Eclipse's birthday in Hyderabad, India in a huge way (doesn't it say …).
  3. API Documentation: Every developer and technical lead wants good API documentation so that the learning cycle is shorter and you don't spend too much time digging through unclear documentation. However good the software is, I want good API documentation, and Eclipse has it.
  4. Free plugins: Once I have the base platform, I want to add supporting and new features. And yes, many of the VERY useful features are free. I have used so many plugins, like FindBugs, Checkstyle, Subclipse, etc., and in the end, for IT departments, it's so nice to have something free and USEFUL.
  5. Code Samples: Sometimes I understand better with the help of code than by reading documentation. Eclipse has great code samples for all top-level projects. Whoever starts with SWT or JFace will definitely get lots of samples from the snippets and also from the articles. Thanks to the guys who have supported all of these.
  6. Design Philosophy: Many colleagues and subordinates have learned good design practices and programming patterns from the Eclipse code, thanks in part to the book written by Gamma and Beck. It has a clean plugin architecture and clean interfaces.
  7. Customization: With the base platform in place, you can do whatever you want. Users have every possibility to customize their plugins the way they want.
  8. Extensibility: How else would we have seen the Java IDE, C++ IDE, Cobol IDE, PHP IDE, RCP applications, etc.?
  9. Productization: It's so easy to productize: changing the icons, splash screens, custom messages, etc. Looking at some applications, you would not even recognize that they are built on Eclipse.
  10. Cross-Platform: A recent plugin I developed for a RIA/Web 2.0 client works on Windows, Linux and Mac OS X with the same code base and a single build. This is a big relief for the many organizations that want to develop applications for multiple platforms.

Posted in Eclipse, Eclipse Performance, News, Plug-ins, Plugin, RCP | Leave a Comment »