Depending on the document, simply passing it through a scanner is sometimes not enough to enable automatic processing with OCR, OMR and other functions. Excessive colour or grey backgrounds, security background patterns and similar features can require these colours to be replaced with others before the document image is passed to OCR or OMR interpretation. It is also sometimes useful to drop out certain colours so that a portion of a document can be tested for being empty or not, e.g. testing for the presence of a rubber stamp or signature.
(Images from left to right: Quality before Post-Processing, Quality after Post-Processing)
Enhancing the scanned image
Prior to OCR, it is possible to pass the image through multiple image enhancement rules to digitally sharpen, despeckle or soften text. This processing can be applied to the whole document being scanned or just a portion of one page, as indicated by a box drawn around the area of interest while configuring the rule.
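As an illustration of an enhancement rule limited to a boxed area, the following minimal sketch despeckles part of a greyscale page with a 3x3 median filter. It is not Scan2x's implementation: the function name `despeckle_region`, the `box` layout (left, top, right, bottom, with right and bottom exclusive) and the choice of a median filter are all assumptions made for the example.

```python
from statistics import median

def despeckle_region(image, box=None):
    """Apply a 3x3 median filter to a greyscale image (rows of 0-255 ints).

    box is (left, top, right, bottom) with right/bottom exclusive;
    None means the whole page. Hypothetical helper, not a Scan2x API.
    """
    height, width = len(image), len(image[0])
    left, top, right, bottom = box if box is not None else (0, 0, width, height)
    result = [row[:] for row in image]  # copy so the input is not modified
    for y in range(max(top, 0), min(bottom, height)):
        for x in range(max(left, 0), min(right, width)):
            # Collect the pixel and its neighbours, clipped at the page edges.
            neighbours = [
                image[ny][nx]
                for ny in range(max(y - 1, 0), min(y + 2, height))
                for nx in range(max(x - 1, 0), min(x + 2, width))
            ]
            result[y][x] = int(median(neighbours))
    return result

# A white 5x5 page with two isolated black specks.
page = [[255] * 5 for _ in range(5)]
page[1][1] = 0
page[3][3] = 0

# Clean only the top-left 3x3 area; the speck at (3, 3) lies outside the box.
cleaned = despeckle_region(page, box=(0, 0, 3, 3))
```

Here the speck inside the box is removed while the one outside it is left untouched, which mirrors applying a rule to just a portion of one page.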
Colour replacement technology is used in scanning for two main reasons:
• OCR-enabling. Security backgrounds and aesthetic shading of document forms can disrupt or totally prevent OCR processes. By using colour swap-out carefully, we can eliminate the background behind text in areas of interest to dramatically enhance the reliability of OCR in that area.
• Content presence testing. To test whether writing, signatures, stamps or other marks are present on a document, it is often necessary either to temporarily blank out lines, boxes and other printed content on the form, or to search for the colour of the stamp or signature, so that Scan2x can test for the presence of, for example, the colour blue on the document.
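Both uses of colour replacement can be sketched in a few lines. The example below is an illustrative assumption, not how Scan2x works internally: the helper names (`is_close`, `drop_colour`, `colour_present`), the per-channel tolerance and the 2% threshold are all hypothetical, and the image is a plain list of rows of RGB tuples.

```python
def is_close(pixel, target, tolerance):
    """True when every RGB channel is within `tolerance` of the target colour."""
    return all(abs(p - t) <= tolerance for p, t in zip(pixel, target))

def drop_colour(image, target, tolerance=40, replacement=(255, 255, 255)):
    """Return a copy of the image with pixels near `target` replaced."""
    return [
        [replacement if is_close(px, target, tolerance) else px for px in row]
        for row in image
    ]

def colour_present(image, box, target, tolerance=40, min_fraction=0.02):
    """Test whether at least `min_fraction` of the pixels in `box` are near `target`.

    box is (left, top, right, bottom) with right/bottom exclusive.
    """
    left, top, right, bottom = box
    region = [px for row in image[top:bottom] for px in row[left:right]]
    if not region:
        return False
    hits = sum(is_close(px, target, tolerance) for px in region)
    return hits / len(region) >= min_fraction

WHITE, BLACK = (255, 255, 255), (0, 0, 0)
PINK = (250, 200, 210)  # assumed security background tint
BLUE = (20, 40, 200)    # assumed ink colour of a signature or stamp

# 4x6 form: pink background, black text on the top row, blue ink bottom right.
form = [[PINK] * 6 for _ in range(4)]
form[0][1] = form[0][2] = BLACK
form[3][4] = form[3][5] = BLUE

ocr_ready = drop_colour(form, PINK, tolerance=30)                # OCR-enabling
signed = colour_present(form, box=(3, 2, 6, 4), target=BLUE)     # signature box
unsigned_area = colour_present(form, box=(0, 0, 3, 2), target=BLUE)
```

The tolerance matters in practice because scanned colours vary from pixel to pixel. A value that is too tight leaves background residue, while one that is too loose can erase the text itself.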
Adding text and barcodes to scanned documents
It is possible to add text and barcodes to the scanned versions of documents as they are being processed by Scan2x. This functionality can be used to endorse the document with data related to the user who scanned it, the terminal on which it was scanned, or even data derived from what was found in the document.
For example, an application form can be endorsed with the status of the applicant at the time of scanning, with the “Status” data having been retrieved dynamically from a database or web service during the scanning process.
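The endorsement idea can be sketched as a template filled from scan metadata plus a looked-up value. Everything named here is hypothetical and for illustration only: `lookup_status` stands in for the database or web-service call, the dictionary stands in for the data source, and the template fields are not Scan2x configuration syntax.

```python
from datetime import datetime

def lookup_status(applicant_id, status_source):
    """Hypothetical stand-in for a database or web-service lookup."""
    return status_source.get(applicant_id, "UNKNOWN")

def build_endorsement(template, user, terminal, applicant_id, status_source, scanned_at):
    """Fill the endorsement template with scan metadata and the looked-up status."""
    fields = {
        "user": user,
        "terminal": terminal,
        "timestamp": scanned_at.strftime("%Y-%m-%d %H:%M"),
        "status": lookup_status(applicant_id, status_source),
    }
    return template.format(**fields)

# In-memory data source used in place of a real database or web service.
statuses = {"A-1001": "APPROVED"}

text = build_endorsement(
    "Scanned by {user} on {terminal} at {timestamp} | Status: {status}",
    user="jsmith",
    terminal="FRONTDESK-02",
    applicant_id="A-1001",
    status_source=statuses,
    scanned_at=datetime(2024, 3, 5, 9, 30),
)
```

The resulting string is what would be stamped onto the page image, or encoded into a barcode, during processing.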