Document Processing Trinity
Recently, Allgeier DMS installed SOCRATES® (Swift Optical Character Recognition And Text Embedding System), which extracts character information from over 200,000 scanned pages a day.
Now the time has come for Allgeier DMS to take the next step in Document Processing. After years of development and careful investigation of their reliability, Allgeier DMS has invested in two additional Document Processing technologies: Document Classification and Data Extraction.
Document Classification is the process of recognizing types of documents, and classifying them for further processing.
With a capacity of 100,000+ pages per day, the Document Processing built into our SRP (see below) is intended for high-volume automated classification of any kind of document. Suppose you have scanned a pile of documents containing 50 document types that you would like to distinguish; you can then train the Document Classification to recognize these types. Our Document Classification services are built to be accurate and scalable.
Once document types are recognized, scanned incoming mail can be routed to the correct divisions or to other components of the SRP (see below) for further processing. Invoices are routed to the accounting department, job applications to HR, collateral to your sales department, official documents to your legal department, and so on.
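The routing step described above can be sketched as a simple lookup from document type to destination. This is only an illustrative sketch, not the SRP's actual implementation: the type labels, the department names and the `manual_review` fallback are all assumptions made for the example.

```python
# Hypothetical routing table: classified document type -> destination.
# Labels and destinations are illustrative, not taken from the SRP.
ROUTES = {
    "invoice": "accounting",
    "job_application": "hr",
    "collateral": "sales",
    "official_document": "legal",
}


def route(document_type: str, default: str = "manual_review") -> str:
    """Return the destination for a classified document type.

    Unknown types fall back to a default queue (an assumption here),
    so that no scanned item is silently dropped.
    """
    return ROUTES.get(document_type, default)


if __name__ == "__main__":
    for doc_type in ("invoice", "job_application", "newsletter"):
        print(doc_type, "->", route(doc_type))
```

Keeping the mapping in a table rather than in code branches means a new document type can be added without touching the routing logic.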
Document Classification can easily recognize:
• recurring fixed forms (questionnaires, surveys, tests, tax returns, application forms…);
• semi-structured documents (invoices, purchase orders, benefit forms and receipts);
• unstructured documents (contracts, letters, articles).
Our Document Classification Engine can even correctly split multi-page documents. It handles documents with a variable number of pages, documents with multi-page tables, and documents with image or text attachments.
Document Classification is often the first step before Data Extraction. After being recognized, verified and classified, documents can be transferred automatically to our Data Extraction, which provides the reliable, accurate, searchable and highly structured electronic data that business processes depend on.
Data Extraction provides a single entry point for automatically transforming forms and documents of any structure and complexity into usable, accessible data, ready to be exported to your business applications and databases.
Allgeier Data Extraction Services accurately extract data and text from the fields specific to each document type. They apply Optical Recognition, including ICR (for hand-printed text), IMR (checkmark recognition) and barcode recognition. A predefined set of validation rules helps ensure that the data is correct: data can be automatically checked against a database, checked against the expected format, normalized to a standard form, and more. Any custom rule can be added using a scripting language.
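To make the three kinds of validation rule concrete (format check, database check, normalization), here is a minimal sketch in Python. It does not show the scripting language or rule syntax used by Allgeier Data Extraction Services, which the text does not specify; the field names, the `KNOWN_CUSTOMER_IDS` set standing in for a database, and the date and amount formats are all assumptions.

```python
import re
from datetime import datetime

# Hypothetical stand-in for a lookup against a customer database.
KNOWN_CUSTOMER_IDS = {"C-1001", "C-1002"}


def normalize_date(raw: str) -> str:
    """Normalize a DD/MM/YYYY date to ISO format (YYYY-MM-DD)."""
    return datetime.strptime(raw.strip(), "%d/%m/%Y").strftime("%Y-%m-%d")


def validate_invoice(fields: dict) -> list:
    """Apply the rules to extracted fields and return the violations.

    An empty list means the record passed. On success the date field
    is rewritten in place in its normalized form.
    """
    errors = []
    # Format rule: the amount must be digits with two decimals, e.g. 120.50
    if not re.fullmatch(r"\d+\.\d{2}", fields.get("amount", "")):
        errors.append("amount: bad format")
    # Database rule: the customer must already be known
    if fields.get("customer_id") not in KNOWN_CUSTOMER_IDS:
        errors.append("customer_id: unknown")
    # Normalization rule: bring the date to a standard form
    try:
        fields["date"] = normalize_date(fields.get("date", ""))
    except ValueError:
        errors.append("date: bad format")
    return errors


if __name__ == "__main__":
    record = {"amount": "120.50", "customer_id": "C-1001", "date": "28/06/2010"}
    print(validate_invoice(record), record["date"])
```

Returning a list of violations instead of stopping at the first one lets an operator see every problem with a document in a single pass.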
Scanning Resource Platform
With these three technologies on board, the Trinity of Document Processing has now become a reality.
All of them are now built into the Scanning Resource Platform, the rugged backbone of the ScanFactory.
The Scanning Resource Platform (SRP) is a technological backbone developed by Allgeier DMS. It is a modular system that handles the daily processing of 600,000 images (the result of scanning 150,000 sheets a day, on both sides, each in B/W and in color). In addition, the SRP handles the full traceability of all files and documents involved in the scanning process, including traceability of persons and actions. Its fully automated modular structure contains rugged building blocks for image processing, blank page detection, choosing between B/W and color images, batch processing, Optical Character Recognition, grouping images into files or subfiles, exporting them to any popular format and creating the required index files. Now Document Classification and Data Extraction have been added, completing the Document Processing Trinity.
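The daily image count quoted above follows directly from the sheet count. A short check, where the variable names are illustrative only:

```python
# Figures from the text: 150,000 sheets a day, scanned on both sides,
# each side captured in both B/W and color.
sheets_per_day = 150_000
sides_per_sheet = 2
renditions_per_side = 2  # one B/W image and one color image

images_per_day = sheets_per_day * sides_per_sheet * renditions_per_side
print(images_per_day)  # 600000
```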







