[kwlug-disc] OCR to web

Insurance Squared Inc. gcooke at insurancesquared.com
Thu Jan 6 16:51:22 EST 2011

As some of you know, I scan out-of-copyright books and publish them on 
the web.  I've struggled with this process for years.  I'd like your 
input on the following:
- Does anyone know of a decent Linux OCR program with a GUI that will 
let me OCR, say, a 500-page book?
- Let's say I've got the book(s) OCR'ed, so I have hundreds or thousands 
of .txt files and the same number of image files.
     - How do I get those into a usable web platform, i.e. get them into 
a CMS?
     - What CMS suits this type of application?
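
To make the second question concrete, here is a rough sketch of the 
starting point, not a finished importer.  It assumes each page's text 
file and scanned image share a base name (page_0001.txt next to 
page_0001.png, say); that naming scheme, the function name 
build_manifest and the list of image extensions are all hypothetical, 
chosen only for illustration.  It pairs the files up and prints a JSON 
manifest that an import script for whatever CMS gets suggested could 
presumably read.

```python
import json
import sys
from pathlib import Path

# Assumed image formats; adjust to whatever the scanner actually produces.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def build_manifest(book_dir):
    """Pair each page's OCR text with its scan by shared base name."""
    book_dir = Path(book_dir)

    # Map base name -> image file name, e.g. "page_0001" -> "page_0001.png".
    images = {
        p.stem: p.name
        for p in book_dir.iterdir()
        if p.suffix.lower() in IMAGE_EXTS
    }

    pages = []
    # Sorting by file name keeps pages in order if names are zero-padded.
    for txt in sorted(book_dir.glob("*.txt")):
        pages.append({
            "page": txt.stem,
            "text": txt.read_text(encoding="utf-8", errors="replace"),
            "image": images.get(txt.stem),  # None if no matching scan
        })
    return pages


if __name__ == "__main__":
    # Usage: python manifest.py /path/to/book > book.json
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    print(json.dumps(build_manifest(directory), indent=2))
```

One record per page, holding the text and a pointer to the image, is 
about the simplest shape I can think of; whether a given CMS can take 
that in bulk is exactly what I'm asking.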

Any thoughts appreciated.  I'm hoping to put another couple of projects 
online shortly and don't want to use my old platform.

