ComPDF AI is an enterprise-grade Document AI and Intelligent Document Processing (IDP) solution designed for extracting structured information from complex documents at scale. Whether you are operating in regulated industries requiring strict data privacy or building enterprise automation workflows, this guide provides a complete technical overview of ComPDF AI’s capabilities, performance metrics, and deployment hardware requirements.
At a Glance
- Ingest office, image, and markup formats, with inputs up to 4.19M pixels. Output JSON, Markdown, TXT, Excel, and CSV with layout preserved.
- 40+ pre-configured document categories, 80+ OCR languages, fully offline deployment supported.
- Hardware scales from 4 cores / 12 GB VRAM to 16 cores / 24 GB VRAM. Modular licensing. 60-day trial available.
1. Supported Input Formats & Resolution Requirements
To ensure comprehensive document ingestion, ComPDF AI supports a wide array of formats across office, image, and markup categories.
Office Documents: .docx, .xlsx, .pptx, .doc, .xls, .ppt, .pdf
Image Formats: .png, .jpg / .jpeg, .gif, .bmp, .tiff / .tif, .webp
Text & Markup: .csv, .txt, .rtf, .html / .htm, .mhtml / .mht
Resolution Constraints: To guarantee precise recognition and layout detection, input files must meet specific pixel requirements, with an upper bound of 4.19M pixels.
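A pre-flight check before upload could be sketched as follows. The extension sets are copied from the lists above; the function names are illustrative, and reading "4.19M pixels" as a cap on width × height is an assumption. The exact limit should be confirmed against the product documentation, since the figure quoted here is rounded.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the format lists in this section.
SUPPORTED_FORMATS = {
    "office": {".docx", ".xlsx", ".pptx", ".doc", ".xls", ".ppt", ".pdf"},
    "image": {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff", ".tif", ".webp"},
    "markup": {".csv", ".txt", ".rtf", ".html", ".htm", ".mhtml", ".mht"},
}

# Assumption: "4.19M pixels" is interpreted as a cap on width * height.
# The published figure is rounded, so treat this constant as a placeholder.
MAX_PIXELS = 4_190_000


def classify_input(filename: str) -> Optional[str]:
    """Return the format category for a file name, or None if unsupported."""
    suffix = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_FORMATS.items():
        if suffix in extensions:
            return category
    return None


def within_pixel_budget(width: int, height: int, max_pixels: int = MAX_PIXELS) -> bool:
    """Check an image's total pixel count against the assumed upper bound."""
    return width * height <= max_pixels
```

For example, `classify_input("scan.TIFF")` returns `"image"`, while a file with an unlisted extension such as `.odt` returns `None` and can be rejected before any upload is attempted.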
2. Layout Analysis & Document Output Formats
ComPDF AI outputs data in structured formats depending on the specific use case: Markdown, JSON, and TXT for Document Parsing; JSON, Excel, and CSV for Document Extraction.
Beyond traditional OCR, the platform excels in Layout Analysis:
- The parsing engine detects over 30 element types, including standard and irregular tables, merged cells, embedded images, stamps (various shapes and colors), mathematical formulas, code blocks, headers, footers, and references.
- JSON output includes precise positional metadata (bounding boxes) for every detected region.
- Markdown output preserves reading order across pages and columns, maintaining the logical structure of the original file.
- The OCR engine supports 80+ languages, including English, Chinese, Thai, Korean, and Japanese.
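To illustrate how positional metadata might be consumed downstream, the sketch below filters regions by element type and computes the area of a bounding box. The JSON schema (`pages`, `regions`, `type`, `bbox` as `[x0, y0, x1, y1]`) is hypothetical and invented for this example; the actual ComPDF AI output fields may differ.

```python
import json

# Hypothetical parser output. Field names and bbox layout are assumptions
# made for illustration, not the documented ComPDF AI schema.
SAMPLE_JSON = """
{"pages": [{"page": 1, "regions": [
  {"type": "header", "bbox": [40, 20, 560, 50], "text": "Invoice"},
  {"type": "table", "bbox": [40, 120, 560, 400], "text": ""},
  {"type": "stamp", "bbox": [420, 430, 540, 550], "text": ""}
]}]}
"""


def regions_of_type(document: dict, region_type: str) -> list:
    """Collect every region of the given type, tagged with its page number."""
    matches = []
    for page in document.get("pages", []):
        for region in page.get("regions", []):
            if region.get("type") == region_type:
                matches.append({"page": page.get("page"), **region})
    return matches


def bbox_area(bbox) -> float:
    """Area of an [x0, y0, x1, y1] box; degenerate boxes yield 0."""
    x0, y0, x1, y1 = bbox
    return max(0, x1 - x0) * max(0, y1 - y0)


document = json.loads(SAMPLE_JSON)
tables = regions_of_type(document, "table")
```

The same pattern applies to stamps, formulas, or any other detected element type: filter by type first, then use the box coordinates to crop, highlight, or restore layout.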
3. Extraction Coverage: 40+ Pre-Configured Document Types
Designed for downstream automation, the extraction module handles high‑volume processing with no inherent page limit and a default API timeout of 300 seconds per request.
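A client would typically align its own request timeout with the 300-second default. The sketch below uses only the Python standard library; the base URL, the `/extract` path, the bearer-token header, and the payload shape are all hypothetical placeholders, not the documented ComPDF AI API.

```python
import json
import urllib.request

# Default per-request API timeout stated above, in seconds.
DEFAULT_TIMEOUT_S = 300


def build_extract_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request.

    The "/extract" path and bearer-token auth are hypothetical placeholders.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/extract",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = DEFAULT_TIMEOUT_S) -> dict:
    """Send the request, giving up once the client-side timeout elapses."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder host; replace with the real endpoint from your deployment.
request = build_extract_request("https://example.invalid/api", "API_KEY", {"document_id": "abc"})
```

Separating request construction from sending keeps the timeout explicit at the call site and makes the request easy to inspect in tests without network access.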
Pre-configured categories (40+ types)
Financial & Transportation
- VAT invoices (standard and roll‑type)
- Electronic promissory notes
- Bank receipts
- Railway electronic tickets
- Aviation e‑ticket itineraries
Business Operations
- Purchase Orders (PO)
- Sales Orders (SO)
- Invoices
- Air Waybills (AWB)
- Bills of Lading (BOL)
Medical & Identity
- Medical e‑billing receipts
- Medical prescriptions
- Medical test reports
- Diagnosis certificates (Type B)
- National ID cards
Vehicle Documents
- Motor vehicle sales unified invoices (new cars)
- Used car sales unified invoices
For customized needs, users can leverage Custom Extract to define specific fields or AI Extract for one‑click comprehensive structured data extraction with human‑in‑the‑loop validation.
4. Hardware Performance & Latency Benchmarks
Processing speed depends primarily on GPU compute capability. Below are the reference throughput benchmarks for the platform:
| Hardware | Parsing Throughput | Extraction Throughput |
|---|---|---|
| NVIDIA L4 | ~120 tokens/sec | ~25 tokens/sec |
| NVIDIA RTX 4090 | ~500 tokens/sec | ~60 tokens/sec |
A token consumption estimation tool is available for projecting processing costs from document size.
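As a rough back-of-the-envelope aid, the reference figures can be turned into a time estimate by dividing a token count by the throughput. This is a minimal sketch, assuming throughput stays constant for the whole job and that the token count has already been obtained (for instance from the estimation tool); real latency also depends on document complexity.

```python
# Reference throughput in tokens per second, copied from the table above.
THROUGHPUT_TOKENS_PER_SEC = {
    "NVIDIA L4": {"parsing": 120, "extraction": 25},
    "NVIDIA RTX 4090": {"parsing": 500, "extraction": 60},
}


def estimate_seconds(token_count: int, gpu: str, task: str) -> float:
    """Estimated processing time, assuming constant reference throughput."""
    return token_count / THROUGHPUT_TOKENS_PER_SEC[gpu][task]


# Example: a job of 6,000 tokens.
l4_parse_s = estimate_seconds(6000, "NVIDIA L4", "parsing")            # 6000 / 120
rtx_extract_s = estimate_seconds(6000, "NVIDIA RTX 4090", "extraction")  # 6000 / 60
```

Under these assumptions, 6,000 tokens take about 50 seconds to parse on an L4 and about 100 seconds to extract on an RTX 4090.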
5. Accuracy & Extraction Modes
Accuracy varies depending on document type, image quality, scanning conditions, and extraction configuration. ComPDF AI offers two extraction approaches to balance coverage and precision:
- Custom Extract: Upload a document, define the specific fields needed (e.g., invoice amount, tax ID, supplier name), and retrieve structured data for exactly those fields.
- AI Extract: One click automatically detects and extracts all key information, with built‑in human‑in‑the‑loop validation available for quality assurance.
Specific accuracy metrics can be discussed during the evaluation phase based on your document types and use case.
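One way to connect Custom Extract output to human review is to flag any requested field that comes back missing or blank. The sketch below assumes the extraction result arrives as a flat dictionary keyed by field name; both that shape and the field names are hypothetical.

```python
def fields_needing_review(requested: list, extracted: dict) -> list:
    """Return requested fields that are absent, None, or blank strings.

    Assumes a flat {field_name: value} result, which is a hypothetical shape.
    """
    flagged = []
    for name in requested:
        value = extracted.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            flagged.append(name)
    return flagged


# Illustrative field names taken from the Custom Extract example above.
requested_fields = ["invoice_amount", "tax_id", "supplier_name"]
result = {"invoice_amount": "1250.00", "tax_id": "  "}
to_review = fields_needing_review(requested_fields, result)
```

Here `tax_id` is flagged because it is blank and `supplier_name` because it is absent, so both would be routed to a reviewer while `invoice_amount` passes through.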
6. Enterprise Knowledge Base & RAG Architecture
ComPDF AI allows organizations to build private, intelligent Q&A systems directly from uploaded documents, which are automatically parsed with zero manual preprocessing.
- Smart Chunking: Employs multiple chunking strategies optimized for general text, Q&A, resumes, tables, papers, and presentations.
- LLM Integration: Freely configurable backends including Gemini, ChatGPT, Qwen, DeepSeek, and Llama.
- Security: Features built-in answer traceability, flexible role-based permissions, and Model Context Protocol (MCP) support for integration with enterprise AI agents.
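For readers unfamiliar with chunking, the sketch below shows one generic strategy: fixed-size word windows with overlap, so that text near a boundary appears in two adjacent chunks. This is a textbook illustration, not ComPDF AI's Smart Chunking implementation, and the default sizes are arbitrary.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list:
    """Split text into word windows of chunk_size that overlap by `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
sample_chunks = chunk_words(sample, chunk_size=4, overlap=1)
```

With ten words, a window of four, and an overlap of one, the result is three chunks in which each chunk begins with the last word of the previous one. Strategies tuned for tables, resumes, or Q&A pairs would instead split on structural boundaries rather than a fixed word count.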
7. Deployment Architecture & Hardware Requirements
ComPDF AI supports REST API-based cloud deployment, standard on-premise integration, and fully offline (air-gapped) deployment with no internet connection required.
Because licensing is modular, you can purchase only the functions you need. Below are the minimum hardware requirements for each deployment combination:
Single-Function Deployment
| Function | CPU | GPU | RAM | Storage |
|---|---|---|---|---|
| Knowledge Base only | ≥ 4 cores | ≥ 12GB VRAM | ≥ 16 GB | ≥ 100 GB SSD |
| Extraction only | ≥ 8 cores | ≥ 16GB VRAM* | ≥ 32 GB | ≥ 500 GB NVMe SSD |
| Parsing only | ≥ 16 cores | Not required | ≥ 64 GB | ≥ 1 TB NVMe SSD |
* RTX 3090 / 4080 class recommended
Combined Deployment
| Functions | CPU | GPU | RAM | Storage |
|---|---|---|---|---|
| KB + Extract | ≥ 8 cores | ≥ 16GB VRAM | ≥ 48 GB | ≥ 500 GB NVMe SSD |
| KB + Parse | ≥ 8 cores | ≥ 12GB VRAM | ≥ 96 GB | ≥ 500 GB NVMe SSD |
| Extract + Parse | ≥ 16 cores | ≥ 24GB VRAM | ≥ 96 GB | ≥ 1 TB NVMe SSD |
| All three | ≥ 16 cores | ≥ 24GB VRAM | ≥ 128 GB | ≥ 1 TB NVMe SSD |
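The two tables above can be encoded as a lookup keyed by the set of licensed functions, which makes it straightforward to check a candidate host during capacity planning. This is a minimal sketch: the values are copied from the tables, while the short keys (`kb`, `extract`, `parse`) and the function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinSpec:
    cpu_cores: int
    gpu_vram_gb: int  # 0 means no GPU is required
    ram_gb: int
    storage: str


# Minimum requirements copied from the single-function and combined tables.
REQUIREMENTS = {
    frozenset({"kb"}): MinSpec(4, 12, 16, "100 GB SSD"),
    frozenset({"extract"}): MinSpec(8, 16, 32, "500 GB NVMe SSD"),
    frozenset({"parse"}): MinSpec(16, 0, 64, "1 TB NVMe SSD"),
    frozenset({"kb", "extract"}): MinSpec(8, 16, 48, "500 GB NVMe SSD"),
    frozenset({"kb", "parse"}): MinSpec(8, 12, 96, "500 GB NVMe SSD"),
    frozenset({"extract", "parse"}): MinSpec(16, 24, 96, "1 TB NVMe SSD"),
    frozenset({"kb", "extract", "parse"}): MinSpec(16, 24, 128, "1 TB NVMe SSD"),
}


def minimum_spec(functions) -> MinSpec:
    """Look up the minimum spec for a combination of licensed functions."""
    key = frozenset(functions)
    if key not in REQUIREMENTS:
        raise ValueError(f"unknown function combination: {sorted(key)}")
    return REQUIREMENTS[key]


def host_meets(spec: MinSpec, cpu_cores: int, gpu_vram_gb: int, ram_gb: int) -> bool:
    """Check CPU, VRAM, and RAM against a minimum spec (storage is not checked)."""
    return (
        cpu_cores >= spec.cpu_cores
        and gpu_vram_gb >= spec.gpu_vram_gb
        and ram_gb >= spec.ram_gb
    )
```

For example, `minimum_spec({"kb", "extract"})` returns the 8-core / 16 GB VRAM / 48 GB RAM row, and a 16-core host with 24 GB VRAM but only 96 GB RAM does not satisfy the all-three configuration.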
Supported OS: Ubuntu, Fedora, Debian, or CentOS. Currently only 64‑bit Intel (x86_64) processors are supported. Regardless of OS, at least 85GB of total memory is required for the full deployment.
See the full system requirements documentation for additional details.
8. Technical FAQ
Output & Layout
Does ComPDF AI preserve the original document layout?
Layout Analysis performs region‑based segmentation. JSON output includes positional metadata (bounding boxes) for each detected region, enabling layout restoration. Markdown output preserves reading order and merges cross‑page and parallel text while retaining the original document’s logical structure.
Processing & Capacity
What is the average recognition speed on low‑to‑mid‑range embedded hardware?
Speed depends on GPU hardware and document complexity. The reference figures for NVIDIA L4 and RTX 4090 provide a baseline. On‑site testing is recommended for precise latency measurements in your specific environment.
Deployment, Hardware & Features
Can the all‑in‑one hardware specifications be flexibly selected?
A single GPU with ≥ 24GB VRAM is recommended for full deployment. Processing speed scales with GPU compute capacity, and hardware can be customized based on customer requirements.
Does the solution include built‑in image preprocessing?
Image distortion correction (trim correction) and image enhancement are currently available via API. Additional features, including automatic noise removal, auto deskewing, automatic border cropping, glare removal, and overexposure adjustment, are planned for future releases.
Does the all‑in‑one appliance include all ComPDF AI features?
The core ComPDF AI platform consists of three modules: AI Document Parsing, AI Document Extraction, and Enterprise Knowledge Base. Whether all features are included in an appliance deployment depends on the customer’s specific requirements and project scope. A standard feature list and customization scope can be provided once business needs are clarified.
What functions require additional customization beyond the standard feature set?
Customization requirements are determined on a case‑by‑case basis. ComPDF AI supports custom extraction schemas, configurable document categories, and industry‑specific model tuning. The specific scope of standard versus custom features will be defined during the requirements‑gathering phase.
Trial & POC
When can POC testing be provided? Is there a cloud version available for trial?
An online Demo Center is available immediately for hands‑on testing. Enterprise customers can apply for a 60‑day free trial license. POC testing or cloud trials can be arranged once specific customer needs are identified.
Final Note
ComPDF AI is designed as a flexible, enterprise‑grade Document AI platform that supports both high‑scale cloud workloads and fully offline deployments. Its architecture enables organizations to extract structured data from complex documents while maintaining deployment flexibility, security compliance, and performance scalability.