Overview
Layout analysis uses AI to parse the structure of a document's layout. It extracts text, images, tables, layers, and other data from input documents.
Features that support layout analysis:
- PDF to Word
- PDF to Excel
- PDF to PowerPoint
- PDF to HTML
- PDF to RTF
- PDF to TXT
- PDF to CSV
- Extract PDF to JSON
- Extract PDF to Markdown
Notice
- Load the DocumentAI model before using layout analysis, or plug in your own AI engine through the callbacks described in 4.11 Use Custom AI Models via Callbacks.
- When OCR is enabled, layout analysis is automatically enabled.
- AI table recognition is a separate stage controlled by enable_ai_table_recognition.
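The flag interactions in the notes above can be pictured with a small stand-alone model. This is only a sketch: SketchConvertOption and both helper functions are hypothetical stand-ins, not part of the SDK, and the enable_ocr field name is an assumption. Only enable_ai_layout and enable_ai_table_recognition appear in this guide. The sketch also assumes table recognition depends on nothing but its own flag, which the notes do not state either way.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SDK's convert options. This is NOT the
   real CConvertOption; enable_ocr is an assumed field name. */
typedef struct {
    bool enable_ocr;
    bool enable_ai_layout;
    bool enable_ai_table_recognition;
} SketchConvertOption;

/* Layout analysis runs when it is requested explicitly, or implicitly
   because OCR is enabled. */
bool sketch_layout_analysis_active(const SketchConvertOption *opt) {
    return opt->enable_ai_layout || opt->enable_ocr;
}

/* Table recognition is modeled as a separate stage with its own switch
   (assumption: it is independent of the layout flag). */
bool sketch_table_recognition_active(const SketchConvertOption *opt) {
    return opt->enable_ai_table_recognition;
}
```

In this model, an option set with only enable_ocr switched on reports layout analysis as active and table recognition as inactive, which matches the two notes about OCR and enable_ai_table_recognition.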
Sample
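// Load the DocumentAI model before starting a conversion that uses layout analysis.
CPDF_SetDocumentAIModel(CPDF_TEXT("path/documentai.model"), -1);
// Start from the default options and turn on AI layout analysis.
CConvertOption option = CPDF_DefaultConvertOption();
option.enable_ai_layout = true;
// Convert the password-protected PDF to a Word document.
CPDF_StartPDFToWord(CPDF_TEXT("word.pdf"), CPDF_TEXT("password"), CPDF_TEXT("path/output.docx"), option, NULL);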