Extract fields and structured information from documents (images) with optional OCR data using various AI algorithms.

Document Understander Block

The Document Understander block extracts fields from document images, optionally using pre-extracted Optical Character Recognition (OCR) data. The algorithm selected in the block's configuration determines how the document is analyzed and what kind of information is returned.

Overview

The Document Understander combines document understanding, field extraction, and structured data processing to locate and extract the fields you define. It handles document types such as invoices, forms, contracts, and receipts, and returns structured data for downstream processing.

Configuration Options

Algorithm Selection

Select an operation from the Choose Algorithm dropdown:

Field Extraction: Extract specific fields from documents
Structured Data Extraction: Extract structured information (tables, forms)
Entity Recognition: Identify and extract named entities
Layout Analysis: Analyze document layout and structure
Multi-modal Understanding: Advanced document understanding with visual and textual analysis

Input Configuration

Document Input

Property: msg.payload.document_path
Type: string
Description: Path to the document image file
Supported formats: .png, .jpg, .jpeg, .pdf, .tiff
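
A flow can check the file extension before the message reaches the block. The sketch below is illustrative only: `isSupportedDocument` is a hypothetical helper, the extension list is copied from the supported formats above, and case-insensitive matching is an assumption.

```javascript
// Hypothetical pre-check, not part of the block itself.
// The list mirrors the "Supported formats" line above.
const SUPPORTED_EXTENSIONS = [".png", ".jpg", ".jpeg", ".pdf", ".tiff"];

function isSupportedDocument(documentPath) {
  if (typeof documentPath !== "string") return false;
  // Assumption: extensions are matched case-insensitively.
  const lower = documentPath.toLowerCase();
  return SUPPORTED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

console.log(isSupportedDocument("documents/invoice_001.pdf")); // true
console.log(isSupportedDocument("notes.docx")); // false
```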

Field Definitions

Property: msg.payload.fields
Type: array
Description: List of fields to extract from the document

Example:

[
  {
    "name": "invoice_number",
    "type": "text",
    "description": "Invoice number"
  },
  {
    "name": "total_amount",
    "type": "currency",
    "description": "Total amount"
  }
]
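
It can help to sanity-check the field list before sending it. The following is a minimal sketch, not the block's own validation: `validateFields` is a hypothetical helper, and the set of type names is collected from the examples in this document, so the block may accept more or fewer.

```javascript
// Hypothetical helper: not part of the Document Understander block.
// Type names are gathered from the examples in this document only.
const KNOWN_TYPES = new Set([
  "text", "currency", "date", "object", "array", "boolean", "table", "contact",
]);

function validateFields(fields) {
  if (!Array.isArray(fields)) {
    return ["fields must be an array"];
  }
  const errors = [];
  const seen = new Set();
  fields.forEach((field, i) => {
    if (field === null || typeof field !== "object") {
      errors.push(`fields[${i}] must be an object`);
      return;
    }
    if (typeof field.name !== "string" || field.name.trim() === "") {
      errors.push(`fields[${i}].name must be a non-empty string`);
    } else if (seen.has(field.name)) {
      errors.push(`fields[${i}].name "${field.name}" is duplicated`);
    } else {
      seen.add(field.name);
    }
    if (!KNOWN_TYPES.has(field.type)) {
      errors.push(`fields[${i}].type "${field.type}" is not a known type`);
    }
  });
  return errors;
}
```

An empty array means no problems were found; otherwise each entry describes one issue, which a flow could route to a debug node.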

OCR Data (Optional)

Property: msg.payload.ocr_data
Type: object
Description: Pre-extracted OCR text data
Format: JSON object with text and confidence scores
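
Because `ocr_data` is optional, a flow typically attaches it only when OCR output is available. The sketch below assumes the `{ text, confidence }` shape used in the examples in this document and a confidence between 0 and 1; `buildPayload` is a hypothetical helper, not part of the block.

```javascript
// Hypothetical helper that assembles msg.payload for the block.
function buildPayload(documentPath, fields, ocrData) {
  const payload = { document_path: documentPath, fields: fields };
  // ocr_data is optional, so attach it only when the caller supplies it.
  if (ocrData !== undefined && ocrData !== null) {
    const { text, confidence } = ocrData;
    if (typeof text !== "string") {
      throw new TypeError("ocr_data.text must be a string");
    }
    // Assumption: confidence is a number in the range 0.0 to 1.0.
    if (!Number.isFinite(confidence) || confidence < 0 || confidence > 1) {
      throw new RangeError("ocr_data.confidence must be a number in [0, 1]");
    }
    payload.ocr_data = { text, confidence };
  }
  return payload;
}

const fields = [{ name: "contract_type", type: "text", description: "Type of contract" }];
const msg = {
  payload: buildPayload("documents/contract.png", fields, {
    text: "CONTRACT AGREEMENT...",
    confidence: 0.95,
  }),
};
```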

Processing Options

Extraction Confidence

Type: number
Range: 0.0 to 1.0
Default: 0.7
Description: Minimum confidence threshold for field extraction
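
How the block applies this threshold internally is not described here. As an illustration of the idea, the sketch below applies the same kind of cutoff downstream, to fields shaped like the per-field `{ value, confidence }` objects shown later in this document. `splitByConfidence` is a hypothetical helper.

```javascript
// Hypothetical downstream filter; it does not reproduce the block's internals.
function splitByConfidence(extractedFields, threshold = 0.7) {
  if (typeof threshold !== "number" || !(threshold >= 0 && threshold <= 1)) {
    throw new RangeError("threshold must be a number between 0.0 and 1.0");
  }
  const accepted = {};
  const rejected = {};
  for (const [name, field] of Object.entries(extractedFields)) {
    // Fields without a numeric confidence are treated as rejected.
    const ok = typeof field?.confidence === "number" && field.confidence >= threshold;
    (ok ? accepted : rejected)[name] = field;
  }
  return { accepted, rejected };
}

const result = splitByConfidence({
  invoice_number: { value: "INV-2024-001", confidence: 0.95 },
  due_date: { value: "2024-02-15", confidence: 0.55 },
});
console.log(Object.keys(result.accepted)); // [ 'invoice_number' ]
console.log(Object.keys(result.rejected)); // [ 'due_date' ]
```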

Output Format

Type: string
Options: ["json", "structured", "key_value"]
Default: "json"
Description: Format of the extracted data

Include Metadata

Type: boolean
Default: true
Description: Include extraction metadata and confidence scores

Use Cases

Invoice Processing

Extract key fields from invoices:

Invoice image → Document Understander → Extracted fields → Database storage

Form Processing

Extract information from forms:

Form image → Document Understander → Form data → Validation and processing

Contract Analysis

Extract key terms and information:

Contract document → Document Understander → Contract terms → Legal review

Receipt Processing

Extract transaction details:

Receipt image → Document Understander → Transaction data → Expense tracking

Common Patterns

Basic Field Extraction

// Configuration
// Algorithm: Field Extraction
// Output Format: json

// Input message:
{
  "payload": {
    "document_path": "documents/invoice_001.pdf",
    "fields": [
      {
        "name": "invoice_number",
        "type": "text",
        "description": "Invoice number"
      },
      {
        "name": "total_amount",
        "type": "currency",
        "description": "Total amount"
      },
      {
        "name": "due_date",
        "type": "date",
        "description": "Payment due date"
      }
    ]
  }
}

// Example flow:
// inject → Document Understander → debug (extracted fields)

Structured Data Extraction

// Configuration
// Algorithm: Structured Data Extraction
// Include Metadata: true

// Input message:
{
  "payload": {
    "document_path": "forms/application.pdf",
    "fields": [
      {
        "name": "applicant_name",
        "type": "text",
        "description": "Full name of applicant"
      },
      {
        "name": "contact_info",
        "type": "object",
        "description": "Contact information object"
      }
    ]
  }
}

// Example flow:
// inject → Document Understander → debug (structured data)

Multi-modal Understanding

// Configuration
// Algorithm: Multi-modal Understanding
// Output Format: structured

// Input message:
{
  "payload": {
    "document_path": "documents/contract.png",
    "fields": [
      {
        "name": "contract_type",
        "type": "text",
        "description": "Type of contract"
      },
      {
        "name": "parties",
        "type": "array",
        "description": "Contracting parties"
      }
    ],
    "ocr_data": {
      "text": "CONTRACT AGREEMENT...",
      "confidence": 0.95
    }
  }
}

// Example flow:
// OCR → Document Understander → debug (multi-modal extraction)

Advanced Features

Custom Field Types

Define custom field types for specific extraction needs:

{
  "fields": [
    {
      "name": "signature_present",
      "type": "boolean",
      "description": "Whether document contains a signature"
    },
    {
      "name": "table_data",
      "type": "table",
      "description": "Extract table data as structured object"
    },
    {
      "name": "contact_info",
      "type": "contact",
      "description": "Extract contact information object"
    }
  ]
}

Confidence Scoring

Each extracted field is reported with its own confidence score and its location in the document:

{
  "extracted_fields": {
    "invoice_number": {
      "value": "INV-2024-001",
      "confidence": 0.95,
      "location": {
        "page": 1,
        "coordinates": [100, 200, 300, 250]
      }
    },
    "total_amount": {
      "value": "$1,250.00",
      "confidence": 0.92,
      "location": {
        "page": 1,
        "coordinates": [400, 300, 500, 350]
      }
    }
  },
  "overall_confidence": 0.94
}
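
A flow might summarize these scores to decide which field to review first. The sketch below is an assumption-laden illustration: `summarizeConfidence` is a hypothetical helper, and whether the block computes `overall_confidence` as a plain mean is not stated (the sample values above are merely consistent with a rounded mean).

```javascript
// Hypothetical helper: finds the least confident field and the mean score.
function summarizeConfidence(extractedFields) {
  const entries = Object.entries(extractedFields).filter(
    ([, field]) => typeof field?.confidence === "number"
  );
  if (entries.length === 0) {
    return { lowestField: null, mean: null };
  }
  let lowest = entries[0];
  let sum = 0;
  for (const entry of entries) {
    sum += entry[1].confidence;
    if (entry[1].confidence < lowest[1].confidence) {
      lowest = entry;
    }
  }
  return { lowestField: lowest[0], mean: sum / entries.length };
}

const summary = summarizeConfidence({
  invoice_number: { value: "INV-2024-001", confidence: 0.95 },
  total_amount: { value: "$1,250.00", confidence: 0.92 },
});
console.log(summary.lowestField); // total_amount
```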

Layout Analysis

Layout analysis describes the document's structure: its layout type, its sections, and the number of tables and signatures detected:

{
  "document_analysis": {
    "layout_type": "invoice",
    "sections": [
      {
        "type": "header",
        "content": "Company information",
        "confidence": 0.98
      },
      {
        "type": "body",
        "content": "Invoice details",
        "confidence": 0.95
      },
      {
        "type": "footer",
        "content": "Payment information",
        "confidence": 0.92
      }
    ],
    "tables_detected": 1,
    "signatures_detected": 1
  }
}
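
To act on this result, a flow could reduce it to a few flags. This is a minimal sketch assuming the `document_analysis` shape shown above; `summarizeLayout` is a hypothetical helper.

```javascript
// Hypothetical helper that condenses a document_analysis object.
function summarizeLayout(documentAnalysis) {
  const sections = Array.isArray(documentAnalysis.sections)
    ? documentAnalysis.sections
    : [];
  return {
    layoutType: documentAnalysis.layout_type,
    sectionTypes: sections.map((section) => section.type),
    // Section with the lowest confidence, or null when there are none.
    weakestSection: sections.reduce(
      (min, s) => (min === null || s.confidence < min.confidence ? s : min),
      null
    ),
    hasTables: (documentAnalysis.tables_detected || 0) > 0,
    hasSignatures: (documentAnalysis.signatures_detected || 0) > 0,
  };
}

const layout = summarizeLayout({
  layout_type: "invoice",
  sections: [
    { type: "header", content: "Company information", confidence: 0.98 },
    { type: "body", content: "Invoice details", confidence: 0.95 },
    { type: "footer", content: "Payment information", confidence: 0.92 },
  ],
  tables_detected: 1,
  signatures_detected: 1,
});
console.log(layout.weakestSection.type); // footer
```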

Output Structure

Basic Field Extraction

{
  "document_path": "documents/invoice_001.pdf",
  "extracted_fields": {
    "invoice_number": "INV-2024-001",
    "total_amount": "$1,250.00",
    "due_date": "2024-02-15"
  },
  "extraction_metadata": {
    "algorithm_used": "Field Extraction",
    "processing_time": 3.2,
    "overall_confidence": 0.94,
    "timestamp": "2024-01-15T10:30:00Z"
  }
}
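
Before storing this output, a flow will usually normalize the string values. The sketch below assumes US-style amounts such as "$1,250.00"; `parseCurrency` and `toInvoiceRecord` are hypothetical helpers, and the record layout is only an example.

```javascript
// Assumes US-style amounts: comma as thousands separator, period as decimal.
// Other locales would need different handling.
function parseCurrency(value) {
  if (typeof value !== "string") return NaN;
  const cleaned = value.replace(/[^0-9.-]/g, "");
  // Number("") is 0, so treat an empty result as "not a number".
  return cleaned === "" ? NaN : Number(cleaned);
}

// Hypothetical mapping from the basic output to a flat record.
function toInvoiceRecord(output) {
  const fields = output.extracted_fields;
  return {
    invoice_number: fields.invoice_number,
    total_amount: parseCurrency(fields.total_amount),
    due_date: fields.due_date,
    confidence: output.extraction_metadata?.overall_confidence ?? null,
  };
}

const record = toInvoiceRecord({
  document_path: "documents/invoice_001.pdf",
  extracted_fields: {
    invoice_number: "INV-2024-001",
    total_amount: "$1,250.00",
    due_date: "2024-02-15",
  },
  extraction_metadata: { overall_confidence: 0.94 },
});
console.log(record.total_amount); // 1250
```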

Detailed Extraction with Confidence

{
  "document_path": "forms/application.pdf",
  "extracted_fields": {
    "applicant_name": {
      "value": "John Smith",
      "confidence": 0.96,
      "type": "text"
    },
    "contact_info": {
      "value": {
        "email": "[email protected]",
        "phone": "+1-555-123-4567",
        "address": "123 Main St, City, State 12345"
      },
      "confidence": 0.89,
      "type": "object"
    }
  },
  "extraction_metadata": {
    "algorithm_used": "Structured Data Extraction",
    "fields_extracted": 2,
    "fields_failed": 0,
    "processing_time": 4.1
  }
}
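
Downstream nodes often want plain values regardless of whether the output is basic or detailed. The sketch below handles both shapes shown in this section; `flattenFields` is a hypothetical helper, and treating any object with both `value` and `confidence` keys as a detailed field is an assumption.

```javascript
// Assumption: a detailed field is an object with both "value" and "confidence".
function isDetailedField(field) {
  return (
    field !== null &&
    typeof field === "object" &&
    "value" in field &&
    "confidence" in field
  );
}

// Hypothetical helper: returns { fieldName: plainValue } for either shape.
function flattenFields(extractedFields) {
  const flat = {};
  for (const [name, field] of Object.entries(extractedFields)) {
    flat[name] = isDetailedField(field) ? field.value : field;
  }
  return flat;
}

const flat = flattenFields({
  applicant_name: { value: "John Smith", confidence: 0.96, type: "text" },
  invoice_number: "INV-2024-001",
});
console.log(flat.applicant_name); // John Smith
console.log(flat.invoice_number); // INV-2024-001
```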

Multi-modal Extraction

{
  "document_path": "documents/contract.png",
  "extracted_fields": {
    "contract_type": {
      "value": "Service Agreement",
      "confidence": 0.92,
      "sources": ["visual_analysis", "text_analysis"]
    },
    "parties": {
      "value": ["ABC Company", "XYZ Corporation"],
      "confidence": 0.88,
      "sources": ["text_analysis"]
    }
  },
  "document_analysis": {
    "layout_type": "legal_contract",
    "visual_elements": ["signatures", "company_logos"],
    "text_quality": "high"
  }
}
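
The `sources` array can be used to separate fields supported by more than one kind of analysis from those supported by only one. This is a sketch under the assumption that every field carries a `sources` array as in the example above; `fieldsBySourceCount` is a hypothetical helper.

```javascript
// Hypothetical helper: groups field names by how many sources support them.
function fieldsBySourceCount(extractedFields) {
  const corroborated = [];
  const singleSource = [];
  for (const [name, field] of Object.entries(extractedFields)) {
    // A missing or malformed sources list counts as zero sources.
    const sources = Array.isArray(field?.sources) ? field.sources : [];
    (sources.length > 1 ? corroborated : singleSource).push(name);
  }
  return { corroborated, singleSource };
}

const groups = fieldsBySourceCount({
  contract_type: {
    value: "Service Agreement",
    confidence: 0.92,
    sources: ["visual_analysis", "text_analysis"],
  },
  parties: {
    value: ["ABC Company", "XYZ Corporation"],
    confidence: 0.88,
    sources: ["text_analysis"],
  },
});
console.log(groups.corroborated); // [ 'contract_type' ]
console.log(groups.singleSource); // [ 'parties' ]
```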