What is ABBYY FlexiCapture?

ABBYY FlexiCapture is an Intelligent Document Processing platform built for the needs of today’s complex digital enterprise. FlexiCapture brings together the best NLP, machine learning, and advanced recognition capabilities into a single, enterprise-scale platform to handle every type of document, from simple forms to complex free-form documents, and every job size, from ad hoc single documents to large batch jobs requiring tough SLAs. Orchestrating the process from acquisition to delivery, FlexiCapture feeds content-driven business applications such as RPA and BPM, helping organizations focus on customer service, cost reduction, compliance, and competitive advantage.


Enterprise automation starts with a comprehensive platform for acquiring, processing, validating, and delivering the right data into critical processes.

  • Faster, straight-through processing: Content from documents entering through any channel, in any format, is automatically extracted, understood, and delivered, removing manual processing friction.
  • Smooth transactions, smart decisions, rapid action: Leverage customer-provided data to accelerate transactions, make smarter decisions, and provide quick, accurate responses to your customers.
  • Control, predictability, and compliance: Gain full chain of custody reporting and management for fine-tuning of results, while ensuring end-to-end compliance with your process and security models.
  • Intelligent data extraction: By leveraging Natural Language Processing (NLP) technology, you can automate the identification and extraction of data from unstructured, complex documents like contracts and emails, along with structured and semi-structured documents, helping to accelerate transactions while significantly reducing operating costs and errors.
  • Data validation and control: Critical data fields, context, and entities are identified, validated, and automatically processed according to business rules and requirements. The system can be easily trained and uses ongoing machine learning for continuous improvements and cost control.

How it works

ABBYY FlexiCapture is a highly accurate and scalable document workflow platform that intelligently captures, classifies and transfers critical data from unstructured and structured documents to the right process, workflow or decision engine.

1. Automated document entry

ABBYY FlexiCapture automatically processes all types of documents from files and scanners in a single flow, including office documents and image formats, email attachments and message bodies.

Digitally born office documents can be processed with:

  • Microsoft Office if it is installed, its use is allowed in the settings, and a valid login and password have been provided.
  • LibreOffice if it is installed and its use is allowed in the settings.
  • The built-in converter if none of the above can be used.
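The fallback logic in the list above can be sketched roughly as follows. This is an illustration only, not FlexiCapture's actual configuration model: the `ConverterSettings` fields and the `choose_converter` function are hypothetical names, and the sketch assumes the converters are tried in the order they are listed.

```python
from dataclasses import dataclass


@dataclass
class ConverterSettings:
    # Hypothetical flags that mirror the conditions named in the list above.
    ms_office_installed: bool = False
    ms_office_allowed: bool = False
    ms_office_credentials_valid: bool = False
    libreoffice_installed: bool = False
    libreoffice_allowed: bool = False


def choose_converter(s: ConverterSettings) -> str:
    """Pick a converter for digitally born office documents.

    Assumption: candidates are checked in list order, and the built-in
    converter is used only when neither office suite qualifies.
    """
    if s.ms_office_installed and s.ms_office_allowed and s.ms_office_credentials_valid:
        return "Microsoft Office"
    if s.libreoffice_installed and s.libreoffice_allowed:
        return "LibreOffice"
    return "built-in converter"
```

Note that in this sketch Microsoft Office without valid credentials does not qualify, so processing falls through to LibreOffice or the built-in converter.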

2. Automatic document classification

The neural-based automatic document classification technology sorts documents by type (e.g., driver license, bank statement, tax form, contract, invoice) and by custom subcategory (e.g., invoices from vendor A, invoices from vendor B) using text content and image patterns.
It can be trained quickly as an auto-classifier: just provide a set of sample documents (no fewer than 10 documents of each type) and specify the reference class for each document in the set. The classifier not only determines the document type but also selects the correct document definition for further content processing.

For many real-life scenarios, the precision/recall ratio can be adjusted easily: simply prioritize either recall or precision, or use the “balanced” mode.
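The trade-off behind these modes can be illustrated with a small sketch. It assumes, purely for illustration, that each mode corresponds to a confidence threshold for accepting a classification; the threshold values, the `MODES` mapping, and the sample data are hypothetical and do not describe how FlexiCapture implements its modes.

```python
def precision_recall(scored, threshold):
    """Compute precision and recall for one document class.

    scored: list of (confidence, belongs_to_class) pairs, where confidence
    is the classifier's score for the class and belongs_to_class is the
    ground truth. A document is accepted when confidence >= threshold.
    """
    accepted = [truth for conf, truth in scored if conf >= threshold]
    total_positive = sum(1 for _, truth in scored if truth)
    true_positive = sum(1 for truth in accepted if truth)
    precision = true_positive / len(accepted) if accepted else 1.0
    recall = true_positive / total_positive if total_positive else 0.0
    return precision, recall


# Hypothetical thresholds standing in for the three modes.
MODES = {"precision": 0.85, "balanced": 0.65, "recall": 0.50}

# Hypothetical scored sample: (confidence, document really is of this class).
SAMPLE = [(0.95, True), (0.90, True), (0.80, False),
          (0.70, True), (0.60, False), (0.40, True)]

for mode, threshold in MODES.items():
    p, r = precision_recall(SAMPLE, threshold)
    print(f"{mode:>9}: precision={p:.2f} recall={r:.2f}")
```

On this sample, a high threshold accepts fewer documents but makes fewer mistakes, while a low threshold finds more documents of the class at the cost of more false matches.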

3. Recognition

At the recognition stage, document images are assembled into multi-page documents or document sets. Their content and data are intelligently extracted and validated automatically in an unattended mode.

Automatic assembly: pages can be assembled into multi-page documents by separators (e.g., blank pages inserted between two documents), by page counters, or with the help of ABBYY neural-based classification algorithms.
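Separator-based assembly is the simplest of these approaches to picture. The sketch below is a generic illustration, not ABBYY's implementation: pages are represented as plain strings, and the `is_separator` predicate (here, "the page has no text") is a hypothetical stand-in for real blank-page detection.

```python
def assemble_by_separators(pages, is_separator):
    """Group a flat stream of pages into documents.

    A separator page closes the current document and is itself dropped.
    Consecutive separators do not produce empty documents.
    """
    documents, current = [], []
    for page in pages:
        if is_separator(page):
            if current:
                documents.append(current)
                current = []
        else:
            current.append(page)
    if current:  # flush the last document if the stream did not end on a separator
        documents.append(current)
    return documents


# Example: empty strings play the role of blank separator pages.
stream = ["p1", "p2", "", "p3", "", "", "p4", "p5"]
print(assemble_by_separators(stream, lambda page: page.strip() == ""))
```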

ABBYY FlexiCapture runs consistency checks to ensure all case-related documents are assembled correctly into a full document set.

Highly accurate OCR/ICR/OMR and barcode recognition incorporates optical character recognition of printed text in up to 200 languages, including CJK; intelligent character recognition for hand-printed text in over 130 languages; barcode recognition for a variety of 1D and 2D barcodes; and optical mark recognition for a wide range of checkmarks.

Automatic validation includes comparison against databases, conformity with built-in validation rules, compliance with formats, data normalization and user-defined checks.
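As a rough illustration of these kinds of checks, the sketch below validates a set of extracted invoice fields. Everything in it is an assumption made for the example: the field names, the `INV-` number format, the accepted date layouts, and the in-memory vendor set that stands in for a database lookup. It is not FlexiCapture's rule engine.

```python
import re
from datetime import datetime
from decimal import Decimal


def normalize_date(text):
    """Data normalization: return the date in ISO 8601 form, or None."""
    for fmt in ("%d.%m.%Y", "%m/%d/%Y", "%Y-%m-%d"):  # hypothetical accepted layouts
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def validate_invoice(fields, known_vendors):
    """Return a list of validation errors (an empty list means the fields passed).

    Amount fields are expected to be decimal strings such as "100.00".
    """
    errors = []
    # Compliance with a format (hypothetical invoice-number pattern).
    if not re.fullmatch(r"INV-\d{6}", fields["invoice_number"]):
        errors.append("invoice_number: bad format")
    # Comparison against a database (a set stands in for the lookup).
    if fields["vendor"] not in known_vendors:
        errors.append("vendor: not found in database")
    # Normalization doubles as a format check for dates.
    if normalize_date(fields["date"]) is None:
        errors.append("date: unrecognized format")
    # User-defined check: the amounts must add up.
    if Decimal(fields["net"]) + Decimal(fields["tax"]) != Decimal(fields["total"]):
        errors.append("total: does not equal net + tax")
    return errors
```

Fields that fail any check would typically be routed to an operator for verification rather than exported.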

4. Data extraction

ABBYY FlexiCapture automatically extracts data from a variety of paper or digital-born document types, structured and unstructured, such as mortgage applications, tax returns, questionnaires, credit card applications, contracts, invoices, customer emails and many more.

5. Verification

The verification station lets an operator check whether extracted fields match those in the original document. Alternatively, verification can be started manually using the web-based verification station, which a verification operator can access from any physical location.

Any of the following techniques can be used:

  • Group verification
  • Verification in Document Window
  • Field verification

6. Data export

ABBYY FlexiCapture automatically exports recognized data to different file formats, or to databases, systems of record and other destination points in line with user-defined rules:

  • Corporate file storage repositories – SharePoint, Laserfiche, etc.
  • ODBC-compatible databases – Oracle, Microsoft SQL Server, and Microsoft Access.
  • ERP and ECM systems – SAP, Microsoft, IBM, Sage, MYOB, Acumatica and others.
  • Smooth integration with RPA workflows to deliver data to legacy systems.
  • Document set images can be exported to one PDF file or placed in a storage location. A file or database record should describe the structure of the document set and contain a link to each document image.
  • Document set fields (including fields in child documents) can be exported to ODBC databases and files. All fields in child documents are available when setting up an export; you can set up mapping and redact sensitive information both in a document section and in linked documents.
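The file that describes a document set and links to each image could look something like the manifest built below. This is a minimal sketch under stated assumptions: the JSON layout, the key names, and the `build_manifest` helper are all hypothetical and are not an ABBYY export format.

```python
import json


def build_manifest(set_name, documents):
    """Describe a document set and link to each exported document image.

    documents: list of (document_type, image_path) pairs in set order.
    """
    return {
        "document_set": set_name,
        "documents": [
            {"index": i, "type": doc_type, "image": path}
            for i, (doc_type, path) in enumerate(documents, start=1)
        ],
    }


manifest = build_manifest(
    "loan-0001",
    [("application", "images/0001.pdf"), ("id_card", "images/0002.pdf")],
)
print(json.dumps(manifest, indent=2))
```

A downstream system can read such a file to reconstruct the set without inspecting the images themselves.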

7. Web-based administration and monitoring console

FlexiCapture HTML5 administration and monitoring station enables 24/7 supervision from any physical location. It provides multi-level administration, automatic notifications for critical failures, and comprehensive reporting.

Three standard types of reporting are available: site productivity, processing productivity, and general operator report. The reports can be generated as a PDF or CSV file for further analysis.



With FlexiCapture capabilities, it is easy to create applications that meet the requirements of specific internal or outsourced business scenarios. FlexiCapture customization scripts and web service API enable companies to tailor processing stages and data routing to suit their specific needs.

Flexible workflow customization capabilities

  • Scanning and classification: FlexiCapture offers scripting capabilities for customizing scanning and classification stages. It is vital for some projects to have additional tools for scanning and classification in order to perform special actions or follow regulations.
  • Recognition and extraction: This script enables a third-party OCR/ICR engine to recognize any region of a field in the document during the customized recognition stage. The recognition stage, which includes assembly of documents and document sets as well as text and data extraction, can be adjusted to any custom scenario.
  • Auto-correction and data validation: The auto-correction script is launched automatically after recognition to auto-replace or modify data in recognized fields. Data validation scripts can be used to create rules to define custom algorithms for data validation and normalization (e.g. dictionaries, entire collection or just a custom set of symbols).
  • Document image enhancement: In contrast to the standard assembly rules in document definition properties, a custom script provides flexibility in document image enhancement by assembling documents into document sets based on user-defined rules.
  • Verification: For the customized verification process, the scripts add controls over document-specific functions, change the software’s behavior for a particular project, or launch automatically for some events that occur when a batch, document or field is processed.
  • Export rules: All processed data can be exported to different formats for further data utilization. Create custom export modules with a scriptable export to deliver data and images directly to external applications, including ECM, CRM and ERP systems.
  • Web Service API: The web service API makes it easy to develop custom applications or import modules that deliver documents directly to FlexiCapture for indexing, classification and data extraction. Data captured by external applications arrives at the FlexiCapture processing server over HTTP or HTTPS. The scripts enable embedding of FlexiCapture web-stations into any back-end system and applying custom scenarios, stages, user roles and design (buttons, menus, and toolbars).
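To make the auto-correction and validation item in the list above more concrete, here is a generic sketch of the kind of rule such scripts might encode. It is written in Python for illustration and does not use FlexiCapture's scripting API; the substitution table, the function names, and the allowed symbol set are all hypothetical.

```python
# Hypothetical table of characters that OCR commonly confuses with digits.
OCR_DIGIT_FIXES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)


def autocorrect_numeric(text):
    """Auto-correction: replace look-alike letters in a numeric field."""
    return text.strip().translate(OCR_DIGIT_FIXES)


def validate_allowed(text, allowed="0123456789.,"):
    """Validation: the field must be non-empty and use only a custom set of symbols."""
    return bool(text) and all(ch in allowed for ch in text)


raw = " 1O5.S0 "
fixed = autocorrect_numeric(raw)
print(fixed, validate_allowed(fixed))
```

A rule like this is only safe for fields known to be numeric; applied to free text it would corrupt ordinary words.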