Image Conversion

To convert an image file to an Office or other format, send a request to /file/handle that includes the image file as input along with the file processing parameters. Before you begin, make sure ComPDFKit Processor is running.

The request is a multipart POST to the processor's /file/handle endpoint. For more information about multipart requests, refer to the API section.

Convert using a local image file

Send a multipart request to /file/handle and attach the image file:

shell
curl -f -X POST http://localhost:7000/file/handle \
-H "Content-Type: multipart/form-data" \
-F file=@"image.png" \
-F executeType="img/docx" \
-F password="file open password" \
-F parameter="{ \"contentOptions\": \"2\", \"worksheetOptions\": \"1\"}" \
> result.docx
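
For repeated conversions, the curl call above can be wrapped in a small shell function. This is a minimal sketch, not part of ComPDFKit Processor: the function name `convert_image` and the `PROCESSOR_URL` and `DRY_RUN` variables are hypothetical, and only form fields that appear in the example above are sent. Text fields are passed with curl's `--form-string` so that the JSON value is transmitted literally.

```shell
# Hypothetical helper wrapping the request shown above.
# Usage: convert_image <input-file> <executeType> <parameter-json> <output-file>
# PROCESSOR_URL (assumed name) overrides the default address from the example.
# DRY_RUN=1 (assumed name) prints the curl command instead of sending it.
convert_image() {
  input=$1
  execute_type=$2
  params=$3
  output=$4

  # Build the curl argument list in the function's positional parameters.
  set -- curl -f -sS -X POST "${PROCESSOR_URL:-http://localhost:7000}/file/handle" \
    -F "file=@$input" \
    --form-string "executeType=$execute_type" \
    --form-string "parameter=$params" \
    -o "$output"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
    return 0
  fi

  # -f makes curl return a non-zero exit status on HTTP error responses.
  "$@"
}
```

With `DRY_RUN=1`, a call such as `convert_image image.png img/docx "$params" result.docx` prints the command without contacting the processor, which is a quick way to check quoting.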

Image Conversion Parameters

This section describes the parameter settings currently supported by ComPDFKit Processor for Image file conversion and processing.

Image to Word

Note: Each conversion type accepts its own parameters when the file is uploaded; the remaining steps are the same.

Image to Word:

json
{
  "enableAiLayout": 1,
  "isContainImg": 1,
  "isContainAnnot": 1,
  "enableOcr": 0,
  "ocrRecognitionLang": "AUTO",
  "pageLayoutMode": "e_Flow",
  "formulaToImage": 1,
  "ocrOption": "ALL",
  "containPageBackgroundImage": 1
}

Required parameters

enableAiLayout: Whether to enable AI layout analysis (0: not enabled; 1: enabled). Default 1.

isContainImg: Whether to include images during conversion (0: not enabled; 1: enabled). Default 1.

isContainAnnot: Whether to include annotations during conversion (0: not enabled; 1: enabled). Default 1.

enableOcr: Whether to use OCR (0: Disable; 1: Enable). Default is 0.

ocrRecognitionLang: OCR recognition language. Supported values: AUTO: Automatic, CHINESE: Simplified Chinese, CHINESE_TRAD: Traditional Chinese, ENGLISH: English, KOREAN: Korean, JAPANESE: Japanese, LATIN: Latin, DEVANAGARI: Devanagari, CYRILLIC: Cyrillic, ARABIC: Arabic, TAMIL: Tamil, TELUGU: Telugu, KANNADA: Kannada, THAI: Thai, GREEK: Greek, ESLAV: Slavic languages. Default is AUTO.

pageLayoutMode: Layout mode, either e_Box or e_Flow. Default is e_Flow.

formulaToImage: Whether to convert formulas to images (0: not enabled; 1: enabled). Default 0.

ocrOption: OCR recognition range, supported types and definitions:

INVALID_CHARACTER: Recognize illegal characters in PDF documents. SCAN_PAGE: Recognize scanned pages in PDF documents. INVALID_CHARACTERAND_SCAN_PAGE: Recognize illegal characters and scanned pages in PDF documents. ALL: Recognize all characters on all pages. Default: ALL.

containPageBackgroundImage: Whether to include page background images during conversion. This setting takes effect only when OCR is used (0: disabled; 1: enabled). Default 1.
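
The parameters above are sent as a JSON string in the `parameter` form field of the request. As a rough sketch, the helper below prints the Image to Word parameter set from this section with `enableOcr` switched to 1 and checks the requested language against the documented `ocrRecognitionLang` values first. The names `OCR_LANGS` and `word_ocr_params` are hypothetical and not part of ComPDFKit Processor.

```shell
# Values copied from the ocrRecognitionLang list above.
OCR_LANGS="AUTO CHINESE CHINESE_TRAD ENGLISH KOREAN JAPANESE LATIN DEVANAGARI CYRILLIC ARABIC TAMIL TELUGU KANNADA THAI GREEK ESLAV"

# Hypothetical helper: prints the Image to Word parameter JSON with OCR
# enabled for the given language; fails for a language not in the list.
word_ocr_params() {
  lang=${1:-AUTO}
  # Reject anything that is not a plain upper-case identifier.
  case $lang in
    *[!A-Z_]*) echo "unsupported ocrRecognitionLang: $lang" >&2; return 1 ;;
  esac
  # Reject identifiers that are not in the documented list.
  case " $OCR_LANGS " in
    *" $lang "*) ;;
    *) echo "unsupported ocrRecognitionLang: $lang" >&2; return 1 ;;
  esac
  cat <<EOF
{
  "enableAiLayout": 1,
  "isContainImg": 1,
  "isContainAnnot": 1,
  "enableOcr": 1,
  "ocrRecognitionLang": "$lang",
  "pageLayoutMode": "e_Flow",
  "formulaToImage": 1,
  "ocrOption": "ALL",
  "containPageBackgroundImage": 1
}
EOF
}

word_ocr_params ENGLISH
```

Validating the value locally catches a misspelled language code before the file is uploaded.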

Image to Excel

Note: Different parameters can be used when uploading files for different functions. The rest of the steps remain the same.

Image to Excel:

json
{
  "enableAiLayout": 1,
  "isContainImg": 1,
  "isContainAnnot": 1,
  "enableOcr": 0,
  "ocrRecognitionLang": "AUTO",
  "excelAllContent": 1,
  "excelWorksheetOption": "e_ForTable",
  "ocrOption": "ALL"
}

Required parameters

enableAiLayout: Whether to enable AI layout analysis (0: not enabled; 1: enabled). Default 1.

isContainImg: Whether to include images during conversion (0: not enabled; 1: enabled). Default 1.

isContainAnnot: Whether to include annotations during conversion (0: not enabled; 1: enabled). Default 1.

enableOcr: Whether to use OCR (0: Disable; 1: Enable). Default is 0.

ocrRecognitionLang: OCR recognition language. Supported values: AUTO: Automatic, CHINESE: Simplified Chinese, CHINESE_TRAD: Traditional Chinese, ENGLISH: English, KOREAN: Korean, JAPANESE: Japanese, LATIN: Latin, DEVANAGARI: Devanagari, CYRILLIC: Cyrillic, ARABIC: Arabic, TAMIL: Tamil, TELUGU: Telugu, KANNADA: Kannada, THAI: Thai, GREEK: Greek, ESLAV: Slavic languages. Default is AUTO.

excelAllContent: Whether to convert all contents. 1: Yes; 0: No. Default is 1.

excelWorksheetOption: Excel worksheet option. e_ForTable: each worksheet contains only one table; e_ForPage: each worksheet contains the tables of one PDF page; e_ForDocument: one worksheet contains the tables of the whole PDF document. Default is e_ForTable.

ocrOption: OCR recognition range, supported types and definitions:

INVALID_CHARACTER: Recognize illegal characters in PDF documents. SCAN_PAGE: Recognize scanned pages in PDF documents. INVALID_CHARACTERAND_SCAN_PAGE: Recognize illegal characters and scanned pages in PDF documents. ALL: Recognize all characters on all pages. Default: ALL.
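
To see how `excelWorksheetOption` affects the output, it can help to generate one parameter set per documented value and convert the same image with each. The loop below is only a sketch of that idea: `excel_params` is a hypothetical helper, and every other value is taken from the Image to Excel example in this section.

```shell
# Hypothetical helper: prints the Image to Excel parameter JSON from this
# section with the given excelWorksheetOption value substituted in.
excel_params() {
  printf '{"enableAiLayout": 1, "isContainImg": 1, "isContainAnnot": 1, '
  printf '"enableOcr": 0, "ocrRecognitionLang": "AUTO", "excelAllContent": 1, '
  printf '"excelWorksheetOption": "%s", "ocrOption": "ALL"}\n' "$1"
}

# One JSON line per documented worksheet option.
for option in e_ForTable e_ForPage e_ForDocument; do
  excel_params "$option"
done
```

Each printed line can be passed as the `parameter` form field of a separate request.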

Image to Slide

Note: Different parameters can be used when uploading files for different functions. The rest of the steps remain the same.

Image to Slide:

json
{
  "enableAiLayout": 1,
  "isContainImg": 1,
  "isContainAnnot": 1,
  "enableOcr": 0,
  "ocrRecognitionLang": "AUTO",
  "ocrOption": "ALL",
  "containPageBackgroundImage": 1
}

Required parameters

enableAiLayout: Whether to enable AI layout analysis (0: not enabled; 1: enabled). Default 1.

isContainImg: Whether to include images during conversion (0: not enabled; 1: enabled). Default 1.

isContainAnnot: Whether to include annotations during conversion (0: not enabled; 1: enabled). Default 1.

enableOcr: Whether to use OCR (0: disable; 1: enable). Default is 0.

ocrRecognitionLang: OCR recognition language. Supported values: AUTO: Automatic, CHINESE: Simplified Chinese, CHINESE_TRAD: Traditional Chinese, ENGLISH: English, KOREAN: Korean, JAPANESE: Japanese, LATIN: Latin, DEVANAGARI: Devanagari, CYRILLIC: Cyrillic, ARABIC: Arabic, TAMIL: Tamil, TELUGU: Telugu, KANNADA: Kannada, THAI: Thai, GREEK: Greek, ESLAV: Slavic languages. Default is AUTO.

ocrOption: OCR recognition range, supported types and definitions:

INVALID_CHARACTER: Recognize illegal characters in PDF documents. SCAN_PAGE: Recognize scanned pages in PDF documents. INVALID_CHARACTERAND_SCAN_PAGE: Recognize illegal characters and scanned pages in PDF documents. ALL: Recognize all characters on all pages. Default: ALL.

containPageBackgroundImage: Whether to include page background images during conversion. This setting takes effect only when OCR is used (0: disabled; 1: enabled). Default 1.

Image to HTML

Note: Different parameters can be used when uploading files for different functions. The rest of the steps remain the same.

Image to HTML:

json
{
  "enableAiLayout": 1,
  "isContainImg": 1,
  "isContainAnnot": 1,
  "enableOcr": 0,
  "ocrRecognitionLang": "AUTO",
  "pageLayoutMode": "e_Flow",
  "htmlOption": "e_SinglePage",
  "ocrOption": "ALL",
  "containPageBackgroundImage": 1
}

Required parameters

enableAiLayout: Whether to enable AI layout analysis (0: not enabled; 1: enabled). Default 1.

isContainImg: Whether to include images during conversion (0: not enabled; 1: enabled). Default 1.

isContainAnnot: Whether to include annotations during conversion (0: not enabled; 1: enabled). Default 1.

enableOcr: Whether to use OCR (0: not enabled; 1: enabled). Default is 0.

ocrRecognitionLang: OCR recognition language. Supported values: AUTO: Automatic, CHINESE: Simplified Chinese, CHINESE_TRAD: Traditional Chinese, ENGLISH: English, KOREAN: Korean, JAPANESE: Japanese, LATIN: Latin, DEVANAGARI: Devanagari, CYRILLIC: Cyrillic, ARABIC: Arabic, TAMIL: Tamil, TELUGU: Telugu, KANNADA: Kannada, THAI: Thai, GREEK: Greek, ESLAV: Slavic languages. Default is AUTO.

pageLayoutMode: Layout mode, either e_Box or e_Flow. Default is e_Flow.

htmlOption: HTML output option. e_SinglePage: Convert the entire PDF file into a single HTML file. e_SinglePageWithBookmark: Convert the PDF file into a single HTML file with an outline for navigation at the beginning of the HTML page. e_MultiPage: Convert the PDF file into multiple HTML files. e_MultiPageWithBookmark: Convert the PDF file into multiple HTML files. Each HTML file corresponds to a PDF page, and users can navigate to the next HTML file via a link at the bottom of the HTML page. Default is e_SinglePage.

ocrOption: OCR recognition range, supported types and definitions:

INVALID_CHARACTER: Recognize illegal characters in PDF documents. SCAN_PAGE: Recognize scanned pages in PDF documents. INVALID_CHARACTERAND_SCAN_PAGE: Recognize illegal characters and scanned pages in PDF documents. ALL: Recognize all characters on all pages. Default: ALL.

containPageBackgroundImage: Whether to include page background images during conversion. This setting takes effect only when OCR is used (0: disabled; 1: enabled). Default 1.
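
The four `htmlOption` values combine two choices: single or multiple output files, and with or without the bookmark variant. A small lookup makes that relationship explicit. The sketch below assumes a hypothetical helper named `html_option`, which is not part of ComPDFKit Processor; the value names are the ones documented above.

```shell
# Hypothetical helper: maps two 0/1 flags onto the documented htmlOption values.
#   $1 = 1 for multiple HTML files, 0 for a single HTML file
#   $2 = 1 to select the WithBookmark variant, 0 otherwise
html_option() {
  case "$1$2" in
    00) echo e_SinglePage ;;
    01) echo e_SinglePageWithBookmark ;;
    10) echo e_MultiPage ;;
    11) echo e_MultiPageWithBookmark ;;
    *)  echo "expected two 0/1 flags" >&2; return 1 ;;
  esac
}

# Example: multiple files, no bookmark variant.
html_option 1 0
```

The printed value can then be placed in the `htmlOption` field of the parameter JSON.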

Image to RTF

Note: Different parameters can be used when uploading files for each specific function. The other steps remain consistent.

Image to RTF:

json
{
  "enableAiLayout": 1,
  "isContainImg": 1,
  "isContainAnnot": 1,
  "enableOcr": 0,
  "ocrRecognitionLang": "AUTO",
  "ocrOption": "ALL",
  "containPageBackgroundImage": 1
}

Required parameters

enableAiLayout: Whether to enable AI layout analysis (0: not enabled; 1: enabled). Default 1.

isContainImg: Whether to include images during conversion (0: not enabled; 1: enabled). Default 1.

isContainAnnot: Whether to include annotations during conversion (0: not enabled; 1: enabled). Default 1.

enableOcr: Whether to use OCR (0: disable; 1: enable). Default is 0.

ocrRecognitionLang: OCR recognition language. Supported values: AUTO: Automatic, CHINESE: Simplified Chinese, CHINESE_TRAD: Traditional Chinese, ENGLISH: English, KOREAN: Korean, JAPANESE: Japanese, LATIN: Latin, DEVANAGARI: Devanagari, CYRILLIC: Cyrillic, ARABIC: Arabic, TAMIL: Tamil, TELUGU: Telugu, KANNADA: Kannada, THAI: Thai, GREEK: Greek, ESLAV: Slavic languages. Default is AUTO.

ocrOption: OCR recognition range, supported types and definitions:

INVALID_CHARACTER: Recognize illegal characters in PDF documents. SCAN_PAGE: Recognize scanned pages in PDF documents. INVALID_CHARACTERAND_SCAN_PAGE: Recognize illegal characters and scanned pages in PDF documents. ALL: Recognize all characters on all pages. Default: ALL.

containPageBackgroundImage: Whether to include page background images during conversion. This setting takes effect only when OCR is used (0: disabled; 1: enabled). Default 1.

Image to CSV

Note: You can use specific parameters for each functionality when uploading files, while the other steps remain the same.

Image to CSV:

json
{
  "enableAiLayout": 1,
  "isContainImg": 1,
  "isContainAnnot": 1,
  "enableOcr": 0,
  "ocrRecognitionLang": "AUTO",
  "excelWorksheetOption": "e_ForTable",
  "ocrOption": "ALL"
}

Required parameters

enableAiLayout: Whether to enable AI layout analysis (0: not enabled; 1: enabled). Default 1.

isContainImg: Whether to include images during conversion (0: not enabled; 1: enabled). Default 1.

isContainAnnot: Whether to include annotations during conversion (0: not enabled; 1: enabled). Default 1.

enableOcr: Whether to use OCR (0: Disable; 1: Enable). Default is 0.

ocrRecognitionLang: OCR recognition language. Supported values: AUTO: Automatic, CHINESE: Simplified Chinese, CHINESE_TRAD: Traditional Chinese, ENGLISH: English, KOREAN: Korean, JAPANESE: Japanese, LATIN: Latin, DEVANAGARI: Devanagari, CYRILLIC: Cyrillic, ARABIC: Arabic, TAMIL: Tamil, TELUGU: Telugu, KANNADA: Kannada, THAI: Thai, GREEK: Greek, ESLAV: Slavic languages. Default is AUTO.

excelWorksheetOption: Excel worksheet option. e_ForTable: each worksheet contains only one table; e_ForPage: each worksheet contains the tables of one PDF page; e_ForDocument: one worksheet contains the tables of the whole PDF document. Default is e_ForTable.

ocrOption: OCR recognition range, supported types and definitions:

INVALID_CHARACTER: Recognize illegal characters in PDF documents. SCAN_PAGE: Recognize scanned pages in PDF documents. INVALID_CHARACTERAND_SCAN_PAGE: Recognize illegal characters and scanned pages in PDF documents. ALL: Recognize all characters on all pages. Default: ALL.

Image to TXT

Note: Different parameters can be used when uploading files for each specific function. The other steps remain consistent.

Image to TXT:

json
{
  "enableAiLayout": 1,
  "isContainImg": 1,
  "isContainAnnot": 1,
  "enableOcr": 0,
  "ocrRecognitionLang": "AUTO",
  "txtTableFormat": 1,
  "ocrOption": "ALL"
}

Required parameters

enableAiLayout: Whether to enable AI layout analysis (0: not enabled; 1: enabled). Default 1.

isContainImg: Whether to include images during conversion (0: not enabled; 1: enabled). Default 1.

isContainAnnot: Whether to include annotations during conversion (0: not enabled; 1: enabled). Default 1.

enableOcr: Whether to use OCR (0: not enabled; 1: enabled). Default is 0.

ocrRecognitionLang: OCR recognition language. Supported values: AUTO: Automatic, CHINESE: Simplified Chinese, CHINESE_TRAD: Traditional Chinese, ENGLISH: English, KOREAN: Korean, JAPANESE: Japanese, LATIN: Latin, DEVANAGARI: Devanagari, CYRILLIC: Cyrillic, ARABIC: Arabic, TAMIL: Tamil, TELUGU: Telugu, KANNADA: Kannada, THAI: Thai, GREEK: Greek, ESLAV: Slavic languages. Default is AUTO.

txtTableFormat: Whether to format tables when converting to TXT (0: not enabled; 1: enabled). Default is 1.

ocrOption: OCR recognition range, supported types and definitions:

INVALID_CHARACTER: Recognize illegal characters in PDF documents. SCAN_PAGE: Recognize scanned pages in PDF documents. INVALID_CHARACTERAND_SCAN_PAGE: Recognize illegal characters and scanned pages in PDF documents. ALL: Recognize all characters on all pages. Default: ALL.