ComPDF

Overview

Layout analysis uses AI to parse and understand a document's layout structure. It extracts text, images, tables, layers, and other data from input documents.

Features that support Layout Analysis:

  • PDF to Word
  • PDF to Excel
  • PDF to PowerPoint
  • PDF to HTML
  • PDF to RTF
  • PDF to TXT
  • PDF to CSV
  • Extract PDF to JSON
  • Extract PDF to Markdown

Notice

  • Load the DocumentAI model before using layout analysis, or plug in your own AI engine via the callbacks described in 4.11 Use Custom AI Models via Callbacks.
  • When OCR is enabled, layout analysis is automatically enabled.
  • AI table recognition is a separate stage controlled by enable_ai_table_recognition.

Sample

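// Load the DocumentAI model before running layout analysis.
CPDF_SetDocumentAIModel(CPDF_TEXT("path/documentai.model"), -1);

// Start from the default conversion options and enable AI layout analysis.
CConvertOption option = CPDF_DefaultConvertOption();
option.enable_ai_layout = true;

// Convert the password-protected PDF to a Word document.
CPDF_StartPDFToWord(CPDF_TEXT("word.pdf"), CPDF_TEXT("password"), CPDF_TEXT("path/output.docx"), option, NULL);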