OCR Software and Solution for Label Processing

Food Package Component Recognition System

Capture nutritional information from food packages with a mobile camera. This system recognizes and extracts nutrition data from an image, such as serving sizes, fat and calorie information, and daily values. The extracted data can then be processed by dedicated algorithms to suggest appropriate foods to health-conscious users.

OCR Functions & Problems

OCR technology is an important part of the food package component recognition system, where it is mainly used to recognize characters for nutrition data analysis. Below is a summary of the most common problems we encountered during recognition:

  • The physical environment for image acquisition, such as lighting, camera performance and shooting angle, was difficult to control. Image quality was therefore inconsistent, which degraded recognition results.
  • Although the content on food packages was relatively uniform, the different formats and background colors made identification and extraction more difficult.
  • Because the food package component recognition system ran in a mobile environment, the client required fast software. OCR, as one of the most important functional modules of the system, had to return recognition results to the client quickly.
  • To improve the recognition accuracy rate, a post-processing algorithm needed to be added. For example, a percentage value cannot exceed 100%. Recognition errors could otherwise result in incorrect nutrition suggestions.
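
The percentage rule mentioned in the last point can be illustrated with a minimal sketch. This is not the post-processing module itself; the function name `check_percentages`, the constant `MAX_PERCENT` and the regular expression are assumptions made for illustration, and the only rule applied is the one stated above (a percentage cannot exceed 100%).

```python
import re

# Rule taken from the text above: a percentage value cannot exceed 100%.
MAX_PERCENT = 100.0

# Hypothetical pattern: a number (optionally with decimals) followed by '%'.
PERCENT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def check_percentages(line):
    """Return (value, is_plausible) for each percentage found in an OCR'd line."""
    results = []
    for match in PERCENT_RE.finditer(line):
        value = float(match.group(1))
        results.append((value, 0.0 <= value <= MAX_PERCENT))
    return results


if __name__ == "__main__":
    # The second line simulates a misread digit and is flagged as implausible.
    print(check_percentages("Total Fat 8g 12%"))
    print(check_percentages("Sodium 160mg 700%"))
```

A value flagged as implausible could be re-recognized or withheld from the suggestion step rather than passed on as-is.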

Our Solution

  • To address the above problems, we customized our standard RTK using our client's real-world samples, and fully met the client's needs.
  • Specifically for images taken with a camera, we developed a pre-processing module (threshold) to improve the accuracy rate.
  • To gear the engine to the content features of food package components, we classified the samples by their data and developed a post-processing module to improve extraction efficiency.
  • To improve recognition speed, we simplified the recognition algorithms to save processing time.
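
The text above only says that the pre-processing module applies a threshold; it does not say which method is used. As one possible illustration, the sketch below uses Otsu's method, a common way of choosing a global threshold for unevenly lit camera images. The function names `otsu_threshold` and `binarize` are hypothetical, and the image is assumed to be a list of rows of gray levels from 0 to 255.

```python
def otsu_threshold(pixels):
    """Return the gray level that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0         # sum of their gray levels
    best_t, best_var = 0, -1.0
    for t in range(256):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(image):
    """Map pixels at or below the Otsu threshold to 0 (dark) and the rest to 1."""
    t = otsu_threshold([p for row in image for p in row])
    return [[0 if p <= t else 1 for p in row] for row in image]


if __name__ == "__main__":
    # Tiny made-up image: dark text pixels (~10) on a bright background (~200).
    print(binarize([[10, 12, 200], [11, 210, 205]]))
```

A single global threshold is the simplest choice; images with strong shadows may need a locally adaptive threshold instead, which this sketch does not cover.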

Print Label Recognition

The client is a printer manufacturer with approximately 200 customers. These customers are wholesalers or manufacturers that supply products to retailers. When supplying products, they need to print large numbers of barcode labels to attach to the goods. In practice, labels are sometimes printed with errors, and retailers then have to return the goods. Suppliers therefore need to confirm that labels are correct at print time.

Printer Proofing Module

The printer test module is an optional plug-in installed on a label printer. The plug-in consists of a scanner and a test system. After the label information is imported into the printer, the printer prints labels according to that information. The test module then scans each label for OCR recognition and compares the recognition results with the print information to check for errors.
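
The comparison step described above can be sketched as follows. This is a simplified illustration, not the plug-in's actual logic: the function names, the field names (`sku`, `price`) and the whitespace normalization are all assumptions.

```python
def normalize(text):
    """Collapse runs of whitespace so spacing differences are not reported as errors."""
    return " ".join(text.split())


def find_label_errors(print_job, ocr_result):
    """Return {field: (expected, recognized)} for every field that differs.

    `print_job` holds the information sent to the printer and `ocr_result`
    holds the text recognized from the scanned label; both are hypothetical
    dictionaries keyed by field name.
    """
    errors = {}
    for field, expected in print_job.items():
        recognized = ocr_result.get(field, "")
        if normalize(expected) != normalize(recognized):
            errors[field] = (expected, recognized)
    return errors


if __name__ == "__main__":
    job = {"sku": "A-1001", "price": "$4.99"}
    scanned = {"sku": "A-1001", "price": "$4.90"}
    print(find_label_errors(job, scanned))
```

An empty result means the recognized text matches the print information; any entry marks a label to reject or reprint.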

OCR Functions & Problems

  • OCR is responsible for recognizing text information and comparing it with the print information.
  • The OCR engine needs to be embedded on the corresponding hardware platform.
  • A barcode printer prints a large volume of labels, so OCR recognition must be fast.
  • A high accuracy rate is required in order to judge reliably whether a label has been printed incorrectly.

Medical Insurance Company UB92 form, CMS1500 form

One of our clients is a leading global provider in the health care industry, offering information and care management products and services. It has two main divisions: the drug wholesale solutions sector (wholesale pharmaceuticals, health care, cosmetics, medical supplies and equipment) and the technical solutions sector (supply chain management for pharma, internal and external medicine, an integrated distribution network, and many hospitals and clinics). To reduce medical errors, such as providing incorrect prescriptions or drug interaction warnings, wrong dosages and expired drugs to patients, the FDA requires drug barcodes on the back label to carry identifying information. This should include the NDC, lot number and expiration date. To meet this requirement, the client intended to develop an OCR system (product) for doctors and pharmaceutical retailers.

OCR Functions & Problems

The problems at hand included:

  • Complex background picture
  • Characters and background pictures were mixed
  • Different character colors
  • Characters were inclined and distorted
  • The recognized character regions were not fixed

Our Solution

To address the above OCR problems, we developed a pre-processing module based on our standard RTK, resulting in an accuracy rate near 100%.

  • To handle the different background and character colors, the image is imported and then thresholded.
  • To handle inclined characters and unfixed recognition regions, we developed a rotation recognition function that recognizes text in four directions, and added a filter function that exports only key information, including lot number and expiration date.
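
The four-direction recognition and the filter function described above could look roughly like the sketch below. It is an illustration under stated assumptions: `ocr` is a hypothetical callable that returns a `(text, confidence)` pair for an image, the image is a list of rows, and the `LOT` and `EXP` patterns are made-up examples of how such fields might be printed, not the formats used in the project.

```python
import re


def rotate90(grid):
    """Rotate a list-of-rows image 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def recognize_any_direction(image, ocr):
    """Run `ocr` at 0, 90, 180 and 270 degrees and keep the most confident text.

    `ocr` is a hypothetical callable: image -> (text, confidence).
    """
    best_text, best_conf = "", -1.0
    current = image
    for _ in range(4):
        text, conf = ocr(current)
        if conf > best_conf:
            best_text, best_conf = text, conf
        current = rotate90(current)
    return best_text


# Assumed label conventions, for illustration only.
LOT_RE = re.compile(r"\bLOT\b[\s:.#]*([A-Z0-9-]+)", re.IGNORECASE)
EXP_RE = re.compile(
    r"\bEXP\b[\s:.]*(\d{1,2}/\d{2,4}|\d{4}-\d{2}(?:-\d{2})?)", re.IGNORECASE
)


def filter_key_fields(text):
    """Export only the lot number and expiration date from recognized text."""
    lot = LOT_RE.search(text)
    exp = EXP_RE.search(text)
    return {
        "lot": lot.group(1) if lot else None,
        "exp": exp.group(1) if exp else None,
    }
```

Trying all four orientations costs four recognition passes per image, so a production system would likely stop early once one orientation yields a sufficiently confident result.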