The client needed to recognize the text in a variety of slide images in order to build an index for internal staff to browse.
The slides were used for internal staff training, conferences, and similar purposes. Most of them were screen-captured images (produced not with ordinary screenshots but with specialized software that captures the screen image directly). The client did not have the original slide files, so OCR was needed to build an index for subsequent processing.
OCR Functions & Problems
Recognize slide images, mainly the headline and body text, and import the results directly into the database for searching, editing, indexing, etc.
- Varied image types: the layout, the background color behind the characters, and the lighting when the original image was captured all affect image quality.
- How to locate the headline region and recognize it
- How to choose the binarization threshold for images with dark colors
- Engine tuning for special types of images
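One common way to pick a threshold for dark images automatically is Otsu's method, which chooses the gray level that best separates the two brightness classes in the image histogram. The sketch below is illustrative only: the source does not say which method was actually used, and the names `otsu_threshold` and `binarize` are hypothetical. It assumes 8-bit grayscale pixels in a flat list, and it assumes that the background covers more area than the text, which is how it decides whether the text is the light or the dark class.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance.

    Pixels <= t form the dark class, pixels > t the light class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the dark class so far
    sum0 = 0    # sum of gray levels in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # dark class still empty
        w1 = total - w0
        if w1 == 0:
            break             # light class empty, nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(pixels):
    """Return a mask with 1 for text pixels and 0 for background pixels.

    Assumption: the background is the larger class. If most pixels are
    dark, the slide has light text on a dark background.
    """
    t = otsu_threshold(pixels)
    dark_count = sum(1 for p in pixels if p <= t)
    dark_background = dark_count > len(pixels) / 2
    if dark_background:
        return [1 if p > t else 0 for p in pixels]
    return [1 if p <= t else 0 for p in pixels]
```

For example, `binarize([20] * 90 + [230] * 10)` treats the ten bright pixels as text, because the dark class dominates the image.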
Samples of each image type were analyzed and classified by their characteristics, and a single uniform plan was drawn up from the results. A customized module added on top of RTK performs specialized image pre-processing, after which RTK is called for recognition. Special images that cannot be recognized are automatically filtered out by the system for manual entry.
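The flow described above (classify, pre-process, recognize, route failures to manual entry) might be sketched as follows. This is a minimal illustration under stated assumptions, not the actual module: RTK's API is not shown in the source, so the engine is passed in as a stand-in callable named `recognize`; the brightness-based `classify` rule, the `PREPROCESSORS` table, and the confidence cutoff used to decide that an image is unrecognizable are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]  # rows of 8-bit grayscale pixels


@dataclass
class OcrResult:
    text: str
    confidence: float  # assumed to be in the range 0..1


def mean_brightness(image: Image) -> float:
    flat = [p for row in image for p in row]
    return sum(flat) / len(flat)


def classify(image: Image) -> str:
    """Hypothetical classifier: split slides by overall brightness."""
    return "dark" if mean_brightness(image) < 128 else "light"


def invert(image: Image) -> Image:
    return [[255 - p for p in row] for row in image]


# One pre-processing routine per image class (hypothetical choices).
PREPROCESSORS: Dict[str, Callable[[Image], Image]] = {
    "dark": invert,                 # turn light-on-dark into dark-on-light
    "light": lambda image: image,   # leave as-is
}


def process_slides(
    slides: Dict[str, Image],
    recognize: Callable[[Image], OcrResult],  # stand-in for the OCR engine call
    min_confidence: float = 0.8,              # assumed cutoff
) -> Tuple[Dict[str, str], List[str]]:
    """Return (indexed text by slide name, slide names needing manual entry)."""
    indexed: Dict[str, str] = {}
    manual_queue: List[str] = []
    for name, image in slides.items():
        prepared = PREPROCESSORS[classify(image)](image)
        result = recognize(prepared)
        if result.confidence >= min_confidence and result.text.strip():
            indexed[name] = result.text
        else:
            manual_queue.append(name)
    return indexed, manual_queue
```

Passing the engine in as a callable keeps the pre-processing layer separate from the recognizer, which mirrors the idea of a custom module sitting on top of the engine.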