ComPDF

Intelligent Text Extraction

Combining traditional OCR with AI, ComPDF AI accurately detects and recognizes text in over 70 languages. It handles both printed and handwritten text, regardless of orientation, and can extract text from documents, images, IDs, invoices, road signs, billboards, and more.

The following example extracts the Chinese text from an image into a JSON file. For the interface, refer to the [ComPDF AI API](/guides/idp/self-hosted-deployment/ComPDF AI-api) guide.

Set the executeType parameter to documentAI/ocr.

Set the parameter field as follows:

json
{
	"lang": "auto"
}
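
As a rough illustration, the two settings above could be assembled into a single request body as follows. This is a minimal sketch: the function name `build_ocr_request` is hypothetical, and placing `parameter` as a nested object next to `executeType` is an assumption. The ComPDF AI API guide defines the actual request layout.

```python
import json


def build_ocr_request(lang: str = "auto") -> dict:
    """Assemble the OCR task settings described above.

    Assumption: executeType and parameter sit side by side in one JSON
    object; check the ComPDF AI API guide for the real request layout.
    """
    return {
        "executeType": "documentAI/ocr",
        "parameter": {"lang": lang},
    }


if __name__ == "__main__":
    # Print the settings as JSON text, ready to attach to a request.
    print(json.dumps(build_ocr_request("chinese"), indent=2))
```

Passing `"chinese"` instead of the default `"auto"` pins recognition to Simplified Chinese, which matches the example scenario of extracting Chinese text from an image.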

Required parameters

lang: The recognition language. Supported values and their meanings are as follows:

  • auto: Automatically detect the language.

  • english: English.

  • chinese: Simplified Chinese.

  • chinese_tra: Traditional Chinese.

  • korean: Korean.

  • japanese: Japanese.

  • latin: Latin.

  • devanagari: Devanagari.
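
Because lang is required, it can help to check the value on the client side before submitting a task. The sketch below is one possible way to do that; `SUPPORTED_LANGS` and `validate_lang` are hypothetical names, and treating the values as case-sensitive is an assumption rather than documented behavior.

```python
# The lang values listed above.
SUPPORTED_LANGS = frozenset({
    "auto", "english", "chinese", "chinese_tra",
    "korean", "japanese", "latin", "devanagari",
})


def validate_lang(lang: str) -> str:
    """Return lang unchanged if it is one of the values listed above."""
    if lang not in SUPPORTED_LANGS:
        raise ValueError(
            f"Unsupported lang {lang!r}; expected one of {sorted(SUPPORTED_LANGS)}"
        )
    return lang
```

For example, `validate_lang("chinese_tra")` returns the value unchanged, while `validate_lang("french")` raises a `ValueError` that lists the accepted values.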

Supported input formats

  • PNG
  • JPG & JPEG
  • BMP
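
A quick pre-check of the file extension can catch unsupported inputs before upload. This is a minimal sketch with hypothetical names (`SUPPORTED_INPUT_SUFFIXES`, `is_supported_input`); it looks only at the extension and does not inspect the file content.

```python
from pathlib import Path

# Extensions for the input formats listed above.
SUPPORTED_INPUT_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def is_supported_input(path: str) -> bool:
    """Check the file extension against the input formats listed above."""
    return Path(path).suffix.lower() in SUPPORTED_INPUT_SUFFIXES
```

The comparison is case-insensitive, so `scan.PNG` is accepted, while `contract.pdf` is rejected.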

Supported output formats

  • JSON: Result file.
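
The structure of the JSON result file is not described in this section, so the sketch below assumes nothing beyond valid JSON. It parses the result text and lists the top-level field names as a first step in exploring the output. The function name `top_level_fields` is hypothetical.

```python
import json


def top_level_fields(result_text: str) -> list:
    """Parse the JSON result and list its top-level field names.

    The result schema is not described here, so nothing beyond valid
    JSON is assumed. A non-object top level yields an empty list.
    """
    data = json.loads(result_text)
    return sorted(data) if isinstance(data, dict) else []
```

To inspect a downloaded result, read the file as UTF-8 text and pass its contents to `top_level_fields`.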