
OCR

Overview

OCR (Optical Character Recognition) is the process of converting images of typed, handwritten, or printed text into machine-encoded text.

OCR is commonly used for text recognition and extraction from scanned PDF files, photographs of documents, scene photos, invoices, receipts, and other image-based documents.

The following features support OCR:

  • PDF to Word
  • PDF to Excel
  • PDF to PowerPoint
  • PDF to HTML
  • PDF to RTF
  • PDF to TXT
  • PDF to Searchable PDF
  • PDF to OFD
  • Extract PDF to JSON
  • Extract PDF to Markdown

Set OCR Language

Use the languages attribute of ConvertOptions to specify which languages OCR should recognize. The value is an array of OCRLanguage constants.

ruby
options = ComPDFKitConversion::ConvertOptions.new
options.enable_ocr = true
options.languages = [
  ComPDFKitConversion::OCRLanguage::ENGLISH,
  ComPDFKitConversion::OCRLanguage::CHINESE
]
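
When language names come from a configuration file or user input, they have to be mapped to OCRLanguage constants before being assigned to languages. The following is a minimal sketch of one way to do that, assuming the supplied names match the constant names case-insensitively. The helper ocr_languages_for is hypothetical and not part of the SDK, and the stand-in module (with placeholder symbol values) is there only so the snippet runs without the SDK installed.

```ruby
# Stand-in so the sketch runs without the SDK; skipped when the real module is loaded.
# The symbol values are placeholders, not the SDK's actual constant values.
unless defined?(ComPDFKitConversion)
  module ComPDFKitConversion
    module OCRLanguage
      ENGLISH = :english
      CHINESE = :chinese
    end
  end
end

# Hypothetical helper (not part of the SDK): maps names such as "english"
# to the matching OCRLanguage constants and rejects unknown names.
def ocr_languages_for(names)
  names.map do |name|
    const_name = name.to_s.strip.upcase
    # Check the shape first, because const_defined? raises NameError on malformed names.
    valid = const_name.match?(/\A[A-Z][A-Z0-9_]*\z/) &&
            ComPDFKitConversion::OCRLanguage.const_defined?(const_name, false)
    raise ArgumentError, "unknown OCR language: #{name.inspect}" unless valid

    ComPDFKitConversion::OCRLanguage.const_get(const_name, false)
  end
end

languages = ocr_languages_for(%w[english chinese])
```

The resulting array can be assigned to options.languages. Rejecting unknown names early keeps a typo in configuration from surfacing later as a conversion problem.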

OCR Options

Use ocr_option to control OCR processing scope.

ruby
options.ocr_option = ComPDFKitConversion::OCROption::ALL

Preserve Page Background

When OCR is enabled, use contain_page_background_image to control whether page background images are preserved.

ruby
options.contain_page_background_image = true
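
The language, scope, and background settings described above are often applied together, so it can help to group them in one place. The sketch below assumes only that the options object exposes the four setters shown in this section. The helper configure_ocr is hypothetical and not part of the SDK, and a Struct with placeholder symbol values stands in for ConvertOptions so the snippet runs on its own.

```ruby
# Hypothetical helper (not part of the SDK): applies the OCR settings to any
# options object that exposes the same setters as ConvertOptions.
def configure_ocr(options, languages:, scope: nil, keep_background: nil)
  options.enable_ocr = true
  options.languages = languages
  # Optional settings are only written when the caller supplies a value.
  options.ocr_option = scope unless scope.nil?
  options.contain_page_background_image = keep_background unless keep_background.nil?
  options
end

# A Struct stands in for ComPDFKitConversion::ConvertOptions here; the symbols
# are placeholders for the real OCRLanguage and OCROption constants.
FakeOptions = Struct.new(:enable_ocr, :languages, :ocr_option, :contain_page_background_image)

opts = configure_ocr(FakeOptions.new, languages: [:english], scope: :all, keep_background: true)
puts opts.to_h
```

With the SDK loaded, the same call would take a ComPDFKitConversion::ConvertOptions instance and real constants such as ComPDFKitConversion::OCRLanguage::ENGLISH in place of the stand-ins.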

Sample

ruby
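# Point the SDK at the Document AI model file before running an OCR conversion.
ComPDFKitConversion::LibraryManager.set_document_ai_model("/path/to/documentai.model", -1)

# Enable OCR, recognize English text, and use the ALL processing scope.
options = ComPDFKitConversion::ConvertOptions.new
options.enable_ocr = true
options.languages = [ComPDFKitConversion::OCRLanguage::ENGLISH]
options.ocr_option = ComPDFKitConversion::OCROption::ALL

# input_file_path and output_file_path must be defined by the caller.
result = ComPDFKitConversion::Conversion.start_pdf_to_word(
  input_file_path,
  "",
  output_file_path,
  options
)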
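
The sample depends on two paths being valid: the Document AI model and the input PDF. A small pre-flight check can report a wrong path before the conversion is attempted. This is a sketch only; missing_paths is a hypothetical helper that uses Ruby's standard File.exist? and makes no SDK calls, and "/path/to/input.pdf" is an illustrative placeholder.

```ruby
# Hypothetical helper (not part of the SDK): takes a hash of label => path and
# returns the labels whose paths do not exist on disk.
def missing_paths(paths)
  paths.reject { |_label, path| File.exist?(path) }.keys
end

missing = missing_paths({
  model: "/path/to/documentai.model",
  input: "/path/to/input.pdf" # placeholder input path
})
warn "Missing before conversion: #{missing.join(', ')}" unless missing.empty?
```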