OCR
Overview
OCR (Optical Character Recognition) is the process of converting images of typed, handwritten, or printed text into machine-encoded text.
OCR is commonly used for text recognition and extraction from scanned PDF files, photographs of documents, scene photos, invoices, receipts, and other image-based documents.
The following features support OCR:
- PDF to Word
- PDF to Excel
- PDF to PowerPoint
- PDF to HTML
- PDF to RTF
- PDF to TXT
- PDF to Searchable PDF
- PDF to OFD
- Extract PDF to JSON
- Extract PDF to Markdown
Set OCR Language
Use the languages option to specify which languages OCR should recognize. The value is an array of OCRLanguage constants.
```ruby
options = ComPDFKitConversion::ConvertOptions.new
options.enable_ocr = true
options.languages = [
  ComPDFKitConversion::OCRLanguage::ENGLISH,
  ComPDFKitConversion::OCRLanguage::CHINESE
]
```

OCR Options
Use ocr_option to control the scope of OCR processing.
```ruby
options.ocr_option = ComPDFKitConversion::OCROption::ALL
```

Preserve Page Background
When OCR is enabled, use contain_page_background_image to control whether page background images are preserved.
```ruby
options.contain_page_background_image = true
```

Sample
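```ruby
# Set the document AI model before starting the conversion.
ComPDFKitConversion::LibraryManager.set_document_ai_model("/path/to/documentai.model", -1)

# Enable OCR for English text across the processing scope selected by ALL.
options = ComPDFKitConversion::ConvertOptions.new
options.enable_ocr = true
options.languages = [ComPDFKitConversion::OCRLanguage::ENGLISH]
options.ocr_option = ComPDFKitConversion::OCROption::ALL

# input_file_path and output_file_path must be defined by the caller.
result = ComPDFKitConversion::Conversion.start_pdf_to_word(
  input_file_path,
  "",
  output_file_path,
  options
)
```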
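
If the same OCR settings are applied in several places, they could be gathered into one small helper. The sketch below is only an illustration: `configure_ocr` and `FakeConvertOptions` are hypothetical names that are not part of the SDK, and the empty-language check is the helper's own choice rather than a documented SDK rule. The stand-in struct mirrors only the four attributes used in this section so that the example runs without the library installed.

```ruby
# Hypothetical stand-in for ComPDFKitConversion::ConvertOptions so the sketch
# runs without the SDK; it mirrors only the attributes used in this section.
FakeConvertOptions = Struct.new(
  :enable_ocr, :languages, :ocr_option, :contain_page_background_image
)

# Hypothetical helper (not part of the SDK): applies OCR settings to any
# options object that exposes the setters shown in this section.
def configure_ocr(options, languages:, ocr_option: nil, keep_background: nil)
  langs = Array(languages) # accept a single constant or an array of them
  raise ArgumentError, "at least one OCR language is required" if langs.empty?

  options.enable_ocr = true
  options.languages = langs
  # Optional settings are only written when the caller supplies them.
  options.ocr_option = ocr_option unless ocr_option.nil?
  options.contain_page_background_image = keep_background unless keep_background.nil?
  options
end

# Usage with the stand-in; a symbol takes the place of an OCRLanguage constant.
opts = configure_ocr(FakeConvertOptions.new, languages: :english, keep_background: true)
```

With the real library, the same helper would presumably be called with `ComPDFKitConversion::ConvertOptions.new` and `OCRLanguage` constants, and the returned object passed to the conversion call as in the sample above.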