ComPDF

Overview

When converting PDF documents into other formats, ComPDF Conversion SDK offers two options shared by several conversion types: contain_image, which controls whether images are included in the generated document, and contain_annotation, which controls whether annotations from the PDF file are retained.

  • When contain_image is enabled, the SDK extracts images from the PDF document and embeds them on the corresponding pages, at the corresponding positions, in the output file. Where images overlap, the SDK merges them into a single image and embeds it at the correct location.
  • When contain_annotation is enabled, most annotations are converted into raster images and embedded at the corresponding positions. Certain types of annotations, such as highlights, underlines, strikeouts, and squiggly lines, are converted into native formatting equivalents in Word, PowerPoint, and HTML documents when possible.
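
The two behaviors described above can be sketched in plain C. This is only an illustration of the documented rules, not the SDK's implementation: every type and function name below (Rect, merge_overlapping, AnnotKind, TargetFormat, plan_annotation) is hypothetical. It also assumes that merged images occupy the bounding box of the overlapping images, and that a native equivalent is always available for the four markup types in Word, PowerPoint, and HTML output, whereas the SDK only promises this "when possible".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical image rectangle (x0 <= x1, y0 <= y1); not an SDK type. */
typedef struct { double x0, y0, x1, y1; } Rect;

/* True when the two rectangles share some interior area. */
static bool rects_overlap(Rect a, Rect b)
{
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

/* Smallest rectangle that contains both inputs. */
static Rect rect_union(Rect a, Rect b)
{
    Rect u;
    u.x0 = a.x0 < b.x0 ? a.x0 : b.x0;
    u.y0 = a.y0 < b.y0 ? a.y0 : b.y0;
    u.x1 = a.x1 > b.x1 ? a.x1 : b.x1;
    u.y1 = a.y1 > b.y1 ? a.y1 : b.y1;
    return u;
}

/* Merge overlapping rectangles in place until none overlap.
 * Returns the new number of rectangles (assumed model of image merging). */
size_t merge_overlapping(Rect *r, size_t n)
{
    bool merged = true;
    while (merged) {
        merged = false;
        for (size_t i = 0; i < n && !merged; i++) {
            for (size_t j = i + 1; j < n; j++) {
                if (rects_overlap(r[i], r[j])) {
                    r[i] = rect_union(r[i], r[j]);
                    r[j] = r[n - 1];   /* drop r[j] by moving the last one in */
                    n--;
                    merged = true;     /* restart the scan with the new set */
                    break;
                }
            }
        }
    }
    return n;
}

/* Hypothetical labels for illustration; not SDK enums. */
typedef enum {
    ANNOT_HIGHLIGHT, ANNOT_UNDERLINE, ANNOT_STRIKEOUT, ANNOT_SQUIGGLY,
    ANNOT_OTHER      /* every other annotation type */
} AnnotKind;

typedef enum { TARGET_WORD, TARGET_POWERPOINT, TARGET_HTML, TARGET_OTHER } TargetFormat;

typedef enum { HANDLE_DROPPED, HANDLE_NATIVE, HANDLE_RASTER } AnnotHandling;

/* Decide how one annotation would be handled under the documented rules. */
AnnotHandling plan_annotation(bool contain_annotation, AnnotKind kind, TargetFormat target)
{
    if (!contain_annotation)
        return HANDLE_DROPPED;         /* annotations are not retained */

    bool text_markup = kind == ANNOT_HIGHLIGHT || kind == ANNOT_UNDERLINE ||
                       kind == ANNOT_STRIKEOUT || kind == ANNOT_SQUIGGLY;
    bool native_target = target == TARGET_WORD || target == TARGET_POWERPOINT ||
                         target == TARGET_HTML;

    /* Assumption: "when possible" is treated as always possible here. */
    return (text_markup && native_target) ? HANDLE_NATIVE : HANDLE_RASTER;
}
```

Under these assumptions, a highlight converted to Word maps to native formatting, while the same highlight converted to a format outside Word, PowerPoint, and HTML falls back to a raster image.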

These options are commonly used in the following conversions:

  • PDF to Word
  • PDF to Excel
  • PDF to PowerPoint
  • PDF to HTML
  • PDF to RTF
  • Extract PDF to JSON
  • Extract PDF to Markdown

Sample

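/* Start from the default options, then keep images and annotations. */
CConvertOption option = CPDF_DefaultConvertOption();
option.contain_image = true;
option.contain_annotation = true;

/* Convert input.pdf (opened with the given password) to a Word document. */
CSDKErrorCode code = CPDF_StartPDFToWord(
    CPDF_TEXT("input.pdf"),
    CPDF_TEXT("password"),
    CPDF_TEXT("path/output.docx"),
    option,
    NULL);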