
ComPDF AI Technical Guide: Formats, Performance & Deployment

Evelyn Cross | Tue. 09 Jun. 2026

CONTENTS

Supported Input Formats & Resolution Requirements

Layout Analysis & Document Output Formats

Extraction Coverage: 40+ Pre-Configured Document Types

Hardware Performance & Latency Benchmarks

Accuracy & Extraction Modes

Enterprise Knowledge Base & RAG Architecture

Deployment Architecture & Hardware Requirements

ComPDF AI is an enterprise-grade Document AI and Intelligent Document Processing (IDP) solution designed for extracting structured information from complex documents at scale. Whether you are operating in regulated industries requiring strict data privacy or building enterprise automation workflows, this guide provides a complete technical overview of ComPDF AI’s capabilities, performance metrics, and deployment hardware requirements.

At a Glance

Ingest Office, image, and markup formats up to 4.19M pixels. Output JSON, Markdown, TXT, Excel, CSV with layout preserved.
40+ pre-configured document categories, 80+ OCR languages, fully offline deployment supported.
Hardware scales from a single GPU with ≥ 24GB VRAM to dual RTX 4090 or a single RTX 6000 / L40 for combined extraction and parsing. Modular licensing. 60‑day trial available.

1. Supported Input Formats & Resolution Requirements

To ensure comprehensive document ingestion, ComPDF AI supports a wide array of formats across office, image, and markup categories.

Office Documents: .docx, .xlsx, .pptx, .doc, .xls, .ppt, .pdf

Image Formats: .png, .jpg / .jpeg, .gif, .bmp, .tiff / .tif, .webp

Text & Markup: .csv, .txt, .rtf, .html / .htm, .mhtml / .mht

Resolution Constraints: To guarantee precise recognition and layout detection, input files must meet specific pixel requirements:

Minimum image size: 3,136 pixels in total (for example, 56 × 56)
Maximum image size: 4,194,304 pixels in total (for example, 2,048 × 2,048)
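
As a rough pre-flight check, the limits above can be tested before a file is submitted. The sketch below assumes the limits apply to the total pixel count (width × height), which matches the "4.19M pixels" figure in At a Glance. The function names are hypothetical and not part of any ComPDF AI SDK.

```python
import math

MIN_PIXELS = 3_136        # minimum stated in this guide
MAX_PIXELS = 4_194_304    # maximum stated in this guide


def check_resolution(width: int, height: int) -> str:
    """Classify an image by total pixel count (assumed to be width * height)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    total = width * height
    if total < MIN_PIXELS:
        return "too_small"
    if total > MAX_PIXELS:
        return "too_large"
    return "ok"


def downscale_to_fit(width: int, height: int) -> tuple[int, int]:
    """Return dimensions scaled down, aspect ratio kept, to fit MAX_PIXELS."""
    total = width * height
    if total <= MAX_PIXELS:
        return width, height
    # Scale both sides by the same factor so the area shrinks to the limit.
    scale = math.sqrt(MAX_PIXELS / total)
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
```

An oversized scan can be resized to the dimensions returned by `downscale_to_fit` with any image library before upload. An undersized image has to be rescanned at a higher resolution, since upscaling adds no detail.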

2. Layout Analysis & Document Output Formats

ComPDF AI outputs data in structured formats depending on the specific use case: Markdown, JSON, and TXT for Document Parsing; JSON, Excel, and CSV for Document Extraction.

Beyond traditional OCR, the platform excels in Layout Analysis:

The parsing engine detects over 30 element types, including standard and irregular tables, merged cells, embedded images, stamps (various shapes and colors), mathematical formulas, code blocks, headers, footers, and references.
JSON output includes precise positional metadata (bounding boxes) for every detected region.
Markdown output meticulously preserves reading order across pages and columns, maintaining the logical structure of the original file.
The OCR engine supports 80+ languages, including English, Chinese, Thai, Korean, and Japanese.
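
To illustrate how the positional metadata might be consumed downstream, here is a minimal sketch that sorts detected regions into a simple top-to-bottom reading order and filters them by element type. The JSON field names and the bounding-box layout are assumptions made for this example. The actual output schema should be taken from the product documentation.

```python
import json

# Hypothetical parsing result. The field names ("regions", "page", "type",
# "bbox", "text") and the [x0, y0, x1, y1] top-left-origin box layout are
# illustrative assumptions, not the documented ComPDF AI schema.
SAMPLE_JSON = """
{"regions": [
  {"page": 1, "type": "paragraph", "bbox": [40, 120, 560, 200], "text": "Intro"},
  {"page": 1, "type": "header",    "bbox": [40, 20, 560, 60],   "text": "Report"},
  {"page": 2, "type": "table",     "bbox": [40, 80, 560, 400],  "text": "Totals"},
  {"page": 1, "type": "table",     "bbox": [40, 220, 560, 500], "text": "Items"}
]}
"""


def reading_order(regions: list[dict]) -> list[dict]:
    """Sort regions by page, then top edge, then left edge."""
    return sorted(regions, key=lambda r: (r["page"], r["bbox"][1], r["bbox"][0]))


def regions_of_type(regions: list[dict], kind: str) -> list[dict]:
    """Keep only regions whose element type matches `kind`."""
    return [r for r in regions if r["type"] == kind]


regions = reading_order(json.loads(SAMPLE_JSON)["regions"])
ordered_text = [r["text"] for r in regions]
table_text = [r["text"] for r in regions_of_type(regions, "table")]
```

This sort only handles single-column pages. Multi-column layouts need column detection first, which is the kind of work the Markdown output is described as doing already.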

3. Extraction Coverage: 40+ Pre-Configured Document Types

Designed for downstream automation, the extraction module handles high‑volume processing with no inherent page limit and a default API timeout of 300 seconds per request.

Pre-configured categories (40+ types)

Financial & Transportation

VAT invoices (standard and roll‑type)
Electronic promissory notes
Bank receipts
Railway electronic tickets
Aviation e‑ticket itineraries

Business Operations

Purchase Orders (PO)
Sales Orders (SO)
Invoices
Air Waybills (AWB)
Bills of Lading (BOL)

Medical & Identity

Medical e‑billing receipts
Medical prescriptions
Medical test reports
Diagnosis certificates (Type B)
National ID cards

Vehicle Documents

Motor vehicle sales unified invoices (new cars)
Used car sales unified invoices

For customized needs, users can leverage Custom Extract to define specific fields or AI Extract for one‑click comprehensive structured data extraction with human‑in‑the‑loop validation.

4. Hardware Performance & Latency Benchmarks

Processing speed is fundamentally determined by GPU compute capabilities. Below are the reference throughput benchmarks for the platform:

GPU Model	Parsing Throughput	Extraction Throughput
NVIDIA L4	~120 tokens/sec	~25 tokens/sec
NVIDIA RTX 4090	~500 tokens/sec	~60 tokens/sec

A token consumption estimator is available for projecting processing costs from document size.
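
The reference figures can be turned into a back-of-the-envelope processing-time estimate. The sketch below is not the vendor's estimation tool. It assumes throughput stays constant at the benchmark value and compares the result with the 300-second default API timeout from section 3. The token counts are hypothetical inputs.

```python
# Reference throughput from the table above, in tokens per second.
THROUGHPUT = {
    "NVIDIA L4": {"parsing": 120, "extraction": 25},
    "NVIDIA RTX 4090": {"parsing": 500, "extraction": 60},
}
DEFAULT_TIMEOUT_S = 300  # default API timeout per request (section 3)


def estimate_seconds(tokens: int, gpu: str, task: str) -> float:
    """Estimated processing time, assuming constant benchmark throughput."""
    return tokens / THROUGHPUT[gpu][task]


def fits_in_timeout(tokens: int, gpu: str, task: str,
                    timeout_s: float = DEFAULT_TIMEOUT_S) -> bool:
    """True if the estimate stays within the request timeout."""
    return estimate_seconds(tokens, gpu, task) <= timeout_s


# A hypothetical 9,000-token extraction job:
# 9000 / 25 = 360 s on an L4, 9000 / 60 = 150 s on an RTX 4090.
l4_seconds = estimate_seconds(9000, "NVIDIA L4", "extraction")
rtx_seconds = estimate_seconds(9000, "NVIDIA RTX 4090", "extraction")
```

Under these assumptions the job would exceed the default timeout on an L4 but finish comfortably on an RTX 4090. Real latency also depends on document complexity, so treat the numbers as a planning aid only.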

5. Accuracy & Extraction Modes

Accuracy varies depending on document type, image quality, scanning conditions, and extraction configuration. ComPDF AI offers two extraction approaches to balance coverage and precision:

Custom Extract: Upload a document, define the specific fields needed (e.g., invoice amount, tax ID, supplier name), and retrieve structured data for exactly those fields.
AI Extract: One click automatically detects and extracts all key information, with built‑in human‑in‑the‑loop validation available for quality assurance.
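
One way to picture the human-in-the-loop step is a check that routes incomplete results to a reviewer. The sketch below assumes an extraction result arrives as a flat dictionary keyed by field name. That shape and the field names are hypothetical and based on the examples above.

```python
# Fields a user might define in Custom Extract (names are illustrative).
REQUESTED_FIELDS = ["invoice_amount", "tax_id", "supplier_name"]


def fields_needing_review(record: dict, requested: list[str]) -> list[str]:
    """Return requested fields that are absent or blank in the record."""
    flagged = []
    for name in requested:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            flagged.append(name)
    return flagged


# Hypothetical extraction result with one blank and one missing field.
record = {"invoice_amount": "1280.50", "tax_id": " "}
to_review = fields_needing_review(record, REQUESTED_FIELDS)
```

Records with an empty `to_review` list could flow straight into downstream systems, while the rest wait for manual confirmation.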

Specific accuracy metrics can be discussed during the evaluation phase based on your document types and use case.

6. Enterprise Knowledge Base & RAG Architecture

ComPDF AI allows organizations to build private, intelligent Q&A systems directly from uploaded documents, which are automatically parsed with zero manual preprocessing.

Smart Chunking: Employs multiple chunking strategies optimized for general text, Q&A, resumes, tables, papers, and presentations.
LLM Integration: Freely configurable backends including Gemini, ChatGPT, Qwen, DeepSeek, and Llama.
Security: Features built‑in answer traceability, flexible role‑based permissions, and MCP (Model Context Protocol) compliance for integration with enterprise AI Agents.
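
For readers new to chunking, the sketch below shows the simplest generic strategy: fixed-size word windows with overlap. It is a textbook baseline for general text, not ComPDF AI's Smart Chunking implementation, and the default sizes are arbitrary.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word windows of `size`, sharing `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once the current window already reaches the end of the text.
        if start + size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, size=4, overlap=1)
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighbouring chunk. Structured content such as tables or resumes usually benefits from splitting along its own boundaries instead, which is why the platform offers several strategies.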

7. Deployment Architecture & Hardware Requirements

ComPDF AI supports REST API‑based Cloud deployment, standard On‑Premise integration, and Fully Offline / Air‑Gapped Deployment (no internet connection required).

Licensing is modular, so you can purchase only the functions you need. Processing speed scales with GPU compute capacity, and hardware can be tailored to customer requirements.

The recommended baseline configuration for standalone deployment:

OS: Ubuntu 24.04 LTS Minimal (x86_64)
GPU: 1 GPU with ≥ 24GB VRAM
CPU: ≥ 8 cores
RAM: ≥ 64 GB
Storage: ≥ 1 TB (expandable for knowledge base scenarios based on actual data volume)

Reference Hardware Configurations

Component	Minimum Baseline	Extraction or Parsing	Extraction + Parsing
OS	Ubuntu 24.04 LTS Minimal (x86_64)	Ubuntu 22.04 / 24.04 LTS	Ubuntu 22.04 / 24.04 LTS
CPU	≥ 8 cores	AMD Ryzen 9 7950X / Intel Core i9‑14900K	AMD EPYC 7513 / Xeon Gold 6338
GPU	1 GPU, ≥ 24GB VRAM	RTX 4090 (24GB) × 1	RTX 4090 (24GB) × 2 / RTX 6000 (48GB) × 1 / L40 (48GB) × 1
RAM	≥ 64 GB	128 GB DDR4	256GB DDR4 / DDR5 ECC
Storage	≥ 1 TB	1 TB Enterprise NVMe SSD (e.g., Samsung PM9A3 / 970 PRO)	2 TB NVMe
Docker	—	≥ 24.0.0	≥ 24.0.0
Docker Compose	—	v2.26.1+	v2.26.1+
Network	—	Gigabit / 2.5GbE (10GbE recommended for distributed search)	Gigabit / 2.5GbE (10GbE recommended for distributed search)

* For knowledge base scenarios, expand storage based on actual data volume.
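
The minimum baseline can be expressed as a simple checklist. The sketch below compares a host description against the Minimum Baseline column above. The `HostSpec` type is hypothetical and its values must be filled in by hand or by your own inventory tooling.

```python
from dataclasses import dataclass


@dataclass
class HostSpec:
    gpu_vram_gb: list[int]   # VRAM of each installed GPU, in GB
    cpu_cores: int
    ram_gb: int
    storage_tb: float


def baseline_gaps(host: HostSpec) -> list[str]:
    """List every Minimum Baseline requirement the host fails to meet."""
    gaps = []
    if not any(vram >= 24 for vram in host.gpu_vram_gb):
        gaps.append("GPU: need at least one GPU with >= 24 GB VRAM")
    if host.cpu_cores < 8:
        gaps.append("CPU: need >= 8 cores")
    if host.ram_gb < 64:
        gaps.append("RAM: need >= 64 GB")
    if host.storage_tb < 1:
        gaps.append("Storage: need >= 1 TB")
    return gaps


ok_host = HostSpec(gpu_vram_gb=[24], cpu_cores=16, ram_gb=128, storage_tb=1.0)
weak_host = HostSpec(gpu_vram_gb=[16, 16], cpu_cores=8, ram_gb=32, storage_tb=2.0)
```

Note that two 16 GB cards do not satisfy the baseline in this check, because the requirement is read as 24 GB on a single GPU.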

8. Technical FAQ

Output & Layout

Does ComPDF AI preserve the original document layout?

Layout Analysis performs region‑based segmentation. JSON output includes positional metadata (bounding boxes) for each detected region, enabling layout restoration. Markdown output preserves reading order and merges cross‑page and parallel text while retaining the original document’s logical structure.

Processing & Capacity

What is the average recognition speed on low‑to‑mid‑range embedded hardware?

Speed depends on GPU hardware and document complexity. The reference figures for NVIDIA L4 and RTX 4090 provide a baseline. On‑site testing is recommended for precise latency measurements in your specific environment.

Deployment, Hardware & Features

Can the all‑in‑one hardware specifications be flexibly selected?

A single GPU with ≥ 24GB VRAM is recommended for full deployment. Processing speed scales with GPU compute capacity, and hardware can be customized based on customer requirements.

Does the solution include built‑in image preprocessing?

Image preprocessing features such as automatic noise removal, skew correction, border cropping, glare removal, and overexposure adjustment are not available in the current version but may be supported in future releases.

Does the all‑in‑one appliance include all ComPDF AI features?

The core ComPDF AI platform consists of three modules: AI Document Parsing, AI Document Extraction, and Enterprise Knowledge Base. Whether all features are included in an appliance deployment depends on the customer’s specific requirements and project scope. A standard feature list and customization scope can be provided once business needs are clarified.

What functions require additional customization beyond the standard feature set?

Customization requirements are determined on a case‑by‑case basis. ComPDF AI supports custom extraction schemas, configurable document categories, and industry‑specific model tuning. The specific scope of standard versus custom features will be defined during the requirements‑gathering phase.

Trial & POC

When can POC testing be provided? Is there a cloud version available for trial?

An online Demo Center is available immediately for hands‑on testing. Enterprise customers can apply for a 60‑day free trial license. POC testing or cloud trials can be arranged once specific customer needs are identified.

Final Note

ComPDF AI is designed as a flexible, enterprise‑grade Document AI platform that supports both high‑scale cloud workloads and fully offline deployments. Its architecture enables organizations to extract structured data from complex documents while maintaining deployment flexibility, security compliance, and performance scalability.
