OCR

Overview

OCR (Optical Character Recognition) is the process of converting images of typed, handwritten, or printed text into machine-encoded text.

OCR is commonly used for text recognition and extraction from scanned PDF files, photographs of documents, scene photos, invoices, receipts, and other image-based documents.

The following features support OCR:

PDF to Word
PDF to Excel
PDF to PowerPoint
PDF to HTML
PDF to RTF
PDF to TXT
PDF to Searchable PDF
PDF to OFD
Extract PDF to JSON
Extract PDF to Markdown

Set OCR Language

Set the languages option to specify which languages OCR should recognize. The value is an array of OCRLanguage constants; the example below also turns OCR on with enable_ocr.

ruby

options = ComPDFKitConversion::ConvertOptions.new
options.enable_ocr = true
options.languages = [
  ComPDFKitConversion::OCRLanguage::ENGLISH,
  ComPDFKitConversion::OCRLanguage::CHINESE
]
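
When language names come from a configuration file or user input, they can be resolved to constants by name. The following is a minimal sketch and not part of the SDK: resolve_ocr_languages is a hypothetical helper, and the stand-in module with placeholder values exists only so the snippet runs without the SDK loaded. Only ENGLISH and CHINESE, the two constants shown above, are assumed to exist.

```ruby
# Stand-in for the real SDK module so this sketch runs on its own.
# The placeholder values are NOT the SDK's actual values.
unless defined?(ComPDFKitConversion)
  module ComPDFKitConversion
    module OCRLanguage
      ENGLISH = :english
      CHINESE = :chinese
    end
  end
end

# Hypothetical helper: map names such as "english" to OCRLanguage constants.
def resolve_ocr_languages(names)
  names.map do |name|
    const_name = name.to_s.strip.upcase
    known = const_name.match?(/\A[A-Z][A-Z0-9_]*\z/) &&
            ComPDFKitConversion::OCRLanguage.const_defined?(const_name, false)
    raise ArgumentError, "unknown OCR language: #{name.inspect}" unless known

    ComPDFKitConversion::OCRLanguage.const_get(const_name, false)
  end
end

languages = resolve_ocr_languages(%w[english chinese])
```

The resulting array can be assigned to options.languages. Failing early on an unknown name keeps a typo in configuration from reaching the conversion call.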

OCR Options

Use ocr_option to control the scope of OCR processing. The value is an OCROption constant; the example below uses OCROption::ALL.

ruby

options.ocr_option = ComPDFKitConversion::OCROption::ALL

Preserve Page Background

When OCR is enabled, use contain_page_background_image to control whether page background images are preserved.

ruby

options.contain_page_background_image = true
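
The OCR-related settings from the sections above can be grouped in one place. This is a minimal sketch under stated assumptions: apply_ocr_settings and DemoOptions are hypothetical names, not SDK APIs, and the helper only relies on the four attribute writers that this page already uses. The symbols passed in the demo are placeholders for real OCRLanguage and OCROption constants.

```ruby
# Hypothetical helper: apply the OCR settings described above to any
# options object that exposes the same attribute writers.
def apply_ocr_settings(options, languages:, scope:, keep_background: nil)
  options.enable_ocr = true
  options.languages = languages
  options.ocr_option = scope
  # Leave the background setting untouched unless a value was given.
  options.contain_page_background_image = keep_background unless keep_background.nil?
  options
end

# Stand-in options object so the sketch runs without the SDK. With the SDK
# loaded, pass ComPDFKitConversion::ConvertOptions.new instead.
DemoOptions = Struct.new(:enable_ocr, :languages, :ocr_option,
                         :contain_page_background_image)

demo = apply_ocr_settings(DemoOptions.new,
                          languages: [:english],
                          scope: :all,
                          keep_background: true)
```

Passing the options object in, rather than creating it inside the helper, keeps the sketch independent of how the SDK constructs its options.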

Sample

ruby

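# Set the document AI model before starting the conversion.
ComPDFKitConversion::LibraryManager.set_document_ai_model("/path/to/documentai.model", -1)

# Enable OCR, choose the recognition language, and set the processing scope.
options = ComPDFKitConversion::ConvertOptions.new
options.enable_ocr = true
options.languages = [ComPDFKitConversion::OCRLanguage::ENGLISH]
options.ocr_option = ComPDFKitConversion::OCROption::ALL

# input_file_path and output_file_path are placeholders for your own paths.
result = ComPDFKitConversion::Conversion.start_pdf_to_word(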
  input_file_path,
  "",
  output_file_path,
  options
)
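
A small wrapper can check the paths before handing off to the conversion call. The sketch below is an illustration, not part of the SDK: convert_with_ocr is a hypothetical helper, the converter is passed in as a callable so the snippet runs without the SDK, and creating the output directory in advance is an assumption made for convenience rather than a documented requirement.

```ruby
require "fileutils"

# Hypothetical wrapper: validate paths, then delegate to a converter callable.
def convert_with_ocr(input_path, output_path, options, converter:)
  raise ArgumentError, "input file not found: #{input_path}" unless File.file?(input_path)

  # Assumption: make sure the output directory exists before converting.
  FileUtils.mkdir_p(File.dirname(output_path))

  # Same argument order as the sample above: input, "", output, options.
  converter.call(input_path, "", output_path, options)
end
```

With the SDK loaded, the callable could be ComPDFKitConversion::Conversion.method(:start_pdf_to_word), so the wrapper forwards the same four arguments used in the sample and returns whatever that call returns.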
