ORPALIS Adds Key-Value Pair Data Extractor to Its OCR SDK

ORPALIS has announced the first implementation of a key-value pair data extractor in its OCR engine for intelligent document understanding and processing.

Key-value pair extraction is at the heart of Intelligent Document Processing systems.

About 90% of the documents that companies and organizations handle are unstructured.
As a result, extracting information from invoices, contracts, forms, bank statements, or emails can be tedious. It is also difficult to index and reuse this information elsewhere.

A key-value pair (KVP) engine automatically extracts meaningful information from unstructured and semi-structured documents.

Like the other OCR technologies developed in-house by the company (MICR, MRZ, OMR, contextual OCR, and more), the KVP extractor uses a hybrid approach that combines heuristics, mathematics, and machine learning (ML).
The engine relies on adaptive layout understanding and on the same underlying techniques as NLP technologies.

The KVP extractor engine automatically adapts to each document and selects the most suitable approach, making the best use of the available resources.

This approach gives excellent results in the cases where traditional OCR and pure machine learning engines are usually weak, especially with:

  • Text recognition in documents with lots of noise,
  • Dotted lines,
  • Touching & broken characters,
  • Text on coloured background,
  • Underlined text,
  • Skewed text,
  • Text in graphics and tables.

In addition to the Key and the Value, the ORPALIS engine returns a Type (the nature of the content) and an Accuracy (a confidence level) for each extracted pair.
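
To illustrate how these four fields might be consumed downstream, here is a minimal Python sketch. It is not the GdPicture.NET API: the `KeyValuePair` class, the `keep_confident` helper, the type labels ("string", "date", "number"), the 0-100 accuracy scale, and the sample values are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyValuePair:
    """Hypothetical container mirroring the four fields named above."""
    key: str          # Key: the label found in the document
    value: str        # Value: the text associated with that label
    value_type: str   # Type: nature of the content (labels are assumed)
    accuracy: float   # Accuracy: confidence level (0-100 scale is assumed)


def keep_confident(pairs, min_accuracy=80.0):
    """Return only the pairs whose confidence meets the threshold."""
    return [p for p in pairs if p.accuracy >= min_accuracy]


# Made-up sample results, as an extractor might return them for an invoice.
pairs = [
    KeyValuePair("Invoice Number", "INV-0042", "string", 97.5),
    KeyValuePair("Due Date", "2023-05-01", "date", 91.0),
    KeyValuePair("Total", "1 2S0.00", "number", 42.0),  # noisy read, low confidence
]

confident = keep_confident(pairs)
for p in confident:
    print(f"{p.key}: {p.value} ({p.value_type}, {p.accuracy:.1f}%)")
```

Filtering on the confidence level lets an application index high-confidence pairs automatically and route the rest, such as the noisy "Total" above, to manual review.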

The KVP extractor is available in the latest GdPicture.NET and DocuVieware SDK downloads. More information is available on the GdPicture.NET website.