Optical Character Recognition (OCR) is a technology that functions much like a printer in reverse. An OCR system reads printed text and converts it to an electronic format for use in document processing applications. There is a wide variety of OCR systems in use today, from the massive document-handling computers used by post offices to the desktop systems that employ scanners for reading text into word processing and spreadsheet applications.
While they often differ in the combination of technologies employed, all OCR systems have several things in common. They take some form of bitmapped image as input, whether drawn from a printed document, magnetic tape, or an image file. They also employ one or more algorithms (rules or procedures used to solve problems) to translate combinations of dots in a bitmap into a recognized character. Finally, all OCR systems output recognized characters in some kind of computer-usable medium, including but not limited to punch cards, electronic data (e.g., as with point-of-sale scanners in grocery stores), and formatted text.
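To make the three common stages concrete, here is a minimal sketch in Python of one very simple recognition approach: template matching. It is an illustration only, not how any particular OCR product works. The 3x5 glyph templates, the three-letter alphabet, and the function names `hamming`, `recognize`, and `recognize_line` are all assumptions made up for this example.

```python
# Hypothetical 3x5 reference bitmaps; '#' is an inked dot, '.' is blank.
# A real system would use far larger bitmaps and a full character set.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def hamming(a, b):
    """Count the dot positions where two equally sized bitmaps differ."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def recognize(glyph):
    """Return the template character whose bitmap is closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(glyph, TEMPLATES[ch]))


def recognize_line(glyphs):
    """Recognize a sequence of glyph bitmaps and output them as plain text."""
    return "".join(recognize(g) for g in glyphs)


if __name__ == "__main__":
    # Input stage: a bitmap of an 'L' with one stray dot of scanner noise.
    noisy_l = ["#..", "#..", "#.#", "#..", "###"]
    # Recognition stage, then output stage: print the characters as text.
    print(recognize(noisy_l))
    print(recognize_line([TEMPLATES["T"], TEMPLATES["I"], TEMPLATES["L"]]))
```

Choosing the closest template rather than demanding an exact match lets the sketch tolerate a stray dot or two, which is the kind of imperfection a scanned page typically introduces.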