Vega Informatica s.r.l.
Creativity
in the service of innovation

Document Processing Agent

Document Processing Agent (DPA) is a software tool for processing, classifying, sorting and automatically archiving large quantities of documents, with the aim of eliminating human intervention or reducing it to a minimum.

DPA is an "autonomous document processing agent" capable of artificial vision and semantic interpretation of images. It extracts, understands and validates document content, then files the original document, together with the automatically extracted metadata, in a repository that supports storage and consultation. The system was designed to be highly modular and is therefore configurable in all its aspects, from the input reading and output writing systems to integration with the applications and middleware already present in the business environment.

 

Context
The tools currently used for document processing (OCRs and Document Management Systems) have not proved as efficient as manual input, because the exceptions they generate cancel out any benefit of the technology.

There are, in fact, settings in which, even today, all the document processes needed to handle cases or deliver services must be managed entirely by hand. For example, in the insurance world, the network of agencies around the country sends a highly asynchronous and uneven flow of documents (identity documents, completed forms, vehicle logbooks, ...) to the headquarters, where large numbers of employees assemble the documents for each case and report anomalies. In these contexts, the cost of case management is particularly high and the work rather stressful.

DPA is the answer to these requirements: an intelligent system that moves beyond simple digitization of the incoming document flow to a more advanced level, exploiting all the information available in a document, not only its written text, by combining the artificial vision approach with the textual/semantic one.

System
The heart of DPA is Grafema, an application that coordinates the various external modules (ICR, semantic engine, document management system, ...) to automate workflow management. Grafema sits on top of the OCR, exploiting its reading capabilities while enriching the process.

Grafema implements algorithms and procedures that optimize the basic reading functions, so that only the content of interest is extracted from a document, without having to rigidly define the areas to read. For example, from a stream of heterogeneous documents we can extract VAT numbers, National Insurance numbers, order numbers, etc., without knowing where the requested information is positioned in each document.
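
The idea of position-independent extraction can be illustrated with a minimal Python sketch. It is not Grafema's algorithm, which this page does not describe; it only shows how fields can be found by their shape in the recognized text instead of by fixed page coordinates. The pattern names and formats (an "IT" prefix plus 11 digits for VAT numbers, an invented "ORD-" order format) are assumptions made for the example.

```python
import re

# Hypothetical field patterns, assumed for illustration only.
PATTERNS = {
    "vat_number": re.compile(r"\bIT\d{11}\b"),    # assumed: "IT" followed by 11 digits
    "order_number": re.compile(r"\bORD-\d{6}\b"),  # invented order number format
}


def extract_fields(text):
    """Return all matches of each pattern, wherever they occur in the text."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


# Text as it might come out of an OCR pass; the layout is irrelevant to the search.
sample = "Invoice for order ORD-004217\nSupplier VAT: IT01234567890\nTotal: 120.00 EUR"
fields = extract_fields(sample)
```

A real deployment would also need to validate each candidate (for example with check digits) and cope with OCR misreads, which plain pattern matching does not address.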

The software is highly modular and dynamic: it can be easily reprogrammed (via scripting) and extended with new functionality, giving system integrators and end users the opportunity to implement the specific functions required by vertical contexts.

This feature makes it possible to integrate, after the reading process, a semantic system that tags the content read and can lead to automatic classification of incoming documents. Once the documents have been processed and the data and metadata acquired, Grafema can "save" this information in any format and on any system that exposes APIs. For example, we have integrated Grafema with one of the best-known document management systems: Alfresco.
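
As a rough illustration of tagging followed by classification, the sketch below uses plain keyword matching. This is a deliberately crude stand-in for a semantic engine, not the engine DPA integrates; the class names and keyword lists are hypothetical.

```python
# Hypothetical keyword lists per document class (assumed for the example).
KEYWORDS = {
    "invoice": {"invoice", "vat", "total"},
    "identity_document": {"passport", "nationality", "birth"},
}


def tag_and_classify(text):
    """Tag the text with every known keyword it contains and pick the best class."""
    words = set(text.lower().split())
    # Score each class by how many of its keywords appear in the text.
    scores = {label: len(words & keywords) for label, keywords in KEYWORDS.items()}
    best_label = max(scores, key=scores.get)
    all_keywords = set().union(*KEYWORDS.values())
    tags = sorted(words & all_keywords)
    # With no keyword hits at all, refuse to guess a class.
    label = best_label if scores[best_label] > 0 else "unclassified"
    return label, tags


label, tags = tag_and_classify("Invoice total 120 EUR VAT IT01234567890")
```

The resulting label and tags are the kind of metadata that could then be stored alongside the original document in a repository.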

Grafema can do much more: for example, it can integrate with workflow systems (BPMS) to take part in business processes.

Configuration
As we have seen, DPA is a suite of fully integrated components, which can nevertheless also be used separately. The whole system is configured by editing simple text files.
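
The format of DPA's configuration files is not documented here, so the following Python sketch only suggests what text-file configuration of such a pipeline could look like. The INI layout, section names, keys and the spool path are all hypothetical.

```python
import configparser

# Hypothetical configuration text; DPA's real file format may differ entirely.
CONFIG_TEXT = """
[input]
type = directory
path = /var/spool/dpa/incoming

[ocr]
engine = vega_icr

[output]
targets = file, database, smtp
"""


def load_config(text):
    """Parse the INI text into a flat dictionary of pipeline settings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "input_type": parser["input"]["type"],
        "input_path": parser["input"]["path"],
        "ocr_engine": parser["ocr"]["engine"],
        # Comma-separated list of output targets, with whitespace trimmed.
        "output_targets": [t.strip() for t in parser["output"]["targets"].split(",")],
    }


config = load_config(CONFIG_TEXT)
```

Keeping each component in its own section mirrors the claim that the components can also be used separately: a deployment that only needs reading could omit the output section.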

OCR Kernel
The kernel of the reading system is an ICR developed by Vega that offers good performance and allows low-level customization for specific applications.

However, it can easily be replaced by any other OCR system that exposes APIs for calling its reading functions.
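
One common way to make an OCR engine replaceable is to have the rest of the system depend on a small interface instead of a concrete engine. The Python sketch below assumes a hypothetical one-method contract, `read(image) -> str`; it is not Grafema's actual API, and both engines are stubs that simply decode bytes so the example runs without a real OCR library.

```python
from typing import Protocol


class OcrEngine(Protocol):
    """Hypothetical contract: any engine that turns image bytes into text."""

    def read(self, image: bytes) -> str: ...


class StubEngineA:
    """Stand-in for a default engine; decodes the bytes for the demo."""

    def read(self, image: bytes) -> str:
        return image.decode("utf-8")


class StubEngineB:
    """Stand-in for a replacement engine whose raw output differs."""

    def read(self, image: bytes) -> str:
        return image.decode("utf-8").upper()


def read_document(engine: OcrEngine, image: bytes) -> str:
    # The caller relies only on the read() contract, so engines can be swapped.
    # Whitespace is normalized so downstream steps see uniform text.
    return " ".join(engine.read(image).split())


text_a = read_document(StubEngineA(), b"VAT  IT01234567890\n")
text_b = read_document(StubEngineB(), b"order  42")
```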

Input, output and integration with external systems

Grafema can write the read and processed data to any file format, save it to a database, send content via email, pass the information to ERP systems or accounting programs, etc. As with output, Grafema can read the documents to be processed from various sources: a spool directory, fax server, FTP server, ...
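
Writing the same extracted data to several destinations can be sketched as a simple fan-out over "sinks". The sketch below is an assumption about structure, not Grafema's implementation: the function names are hypothetical, and in-memory streams stand in for files, databases or mail servers.

```python
import csv
import io
import json


def make_json_sink(stream):
    """Return a sink that appends each record as one JSON line."""
    def sink(record):
        stream.write(json.dumps(record, sort_keys=True) + "\n")
    return sink


def make_csv_sink(stream, fieldnames):
    """Return a sink that appends each record as a CSV row under a header."""
    writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()

    def sink(record):
        writer.writerow(record)
    return sink


def dispatch(record, sinks):
    """Send one processed record to every configured sink."""
    for sink in sinks:
        sink(record)


# In-memory streams stand in for real output resources.
json_out = io.StringIO()
csv_out = io.StringIO()
sinks = [make_json_sink(json_out), make_csv_sink(csv_out, ["doc_type", "vat_number"])]
dispatch({"doc_type": "invoice", "vat_number": "IT01234567890"}, sinks)
```

Adding a new destination then means writing one more sink function, without touching the reading or extraction steps.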

Again, imagination is the only limit to Grafema!

Technical features

  • Suite of integrated applications based primarily on Java technology
  • OCR kernel written in C++ for Windows; a port to Linux is expected
  • Integration available with Alfresco (www.alfresco.it) as a document management system
  • Input reading from a variety of sources (directories, fax server, FTP, ...)
  • Simultaneous writing to multiple resources (files, databases, applications, SMTP, ...)