News: Google slips out open-source OCR engine

sdy284 - September 5, 2006 06:29AM
c|net - "Google has announced that it 'quietly released' a veteran optical character recognition engine as open source a few months ago. The engine, Tesseract, was developed between 1985 and 1995 by HP Labs to some acclaim but was filed away when the company pulled out of the optical character recognition (OCR) business. According to a recent Google Code Blog post by 'Uber Tech Lead' Luc Vincent, a couple of Hewlett-Packard employees decided to dust it off as open-source software with the help of the Information Science Research Institute at the University of Nevada at Las Vegas, which in turn called on Google to help with debugging."