
ComPDFKit Conversion SDK 1.10.0: Convert Scanned PDFs to Editable PDFs

By ComPDFKit | Thu. 01 Feb. 2024
Release | Data Extraction | Conversion SDK

We're thrilled to announce the arrival of ComPDFKit Conversion SDK V1.10.0 for Windows, Mac, iOS, Android, and Linux! This update brings an exciting new feature: the ability to convert scanned PDFs into editable and searchable PDFs on Mac and Windows. You can also now convert PNG and JPG images into popular file formats such as Word, Excel, PPT, HTML, TXT, CSV, and RTF. Finally, this version lets you extract text, images, and tables all at once during PDF data extraction.

 

Convert Scanned PDFs to Editable and Searchable PDFs

 


In the earlier V1.8.1 release of ComPDFKit Conversion SDK, we introduced the OCR plugin, which lets customers integrate ComPDFKit's OCR functionality into their own applications. OCR-enabled conversions improve accuracy, especially for scanned files and documents that contain images. In this latest version, we have extended our OCR capabilities with support for converting scanned PDFs into searchable and editable PDFs on Mac and Windows. Here's an introduction to its use cases, integration with other features, supported platforms, and supported languages:

 

Use Cases: The ComPDFKit OCR feature is designed to handle various types of scanned documents like contracts, research papers, medical bills, financial documents, and more. It quickly recognizes and organizes information from these paper-based files, helping industries solve the challenge of manual data entry, saving on labor costs, and improving efficiency.

 

Integration with Other Features: Once the scanned documents are converted into editable PDFs, you can take advantage of other ComPDFKit features such as text extraction, text selection, search and replace, adding annotations, and text editing. Furthermore, you can convert PDF files to other formats using OCR technology, enabling you to recognize text, images, tables, and other content within a document, thereby improving the accuracy of the conversion results.

 

Supported Platforms: ComPDFKit OCR is now available on Windows, Mac, and Linux (C++, Java, Python, PHP). Please refer to our guides for detailed integration steps.

 

Supported Languages: ComPDFKit OCR supports nearly 50 languages, including English, French, Japanese, Korean, German, Latin, Chinese, Italian, Spanish, and more, so it can recognize text in documents written in any of them.

 

Convert Images to Word, Excel, PPT, and More Formats

 


In this version, we have added a highly anticipated feature: the ability to convert PNG/JPG image files, with OCR, into Word, Excel, PPT, HTML, TXT, CSV, and RTF on Mac and Windows. The scenarios below show where each of these conversions is most useful.

 

Converting PNG/JPG to Word is particularly useful when you need to turn images of scanned paper documents, contracts, or reports into editable Word documents. Once the text is recognized, you can modify, copy, and paste it directly, saving the time and labor costs of manual input.

 

The image to Excel conversion feature excels in handling table data. Whether you need to extract tabular data from images or convert tables within images into editable Excel files, this functionality proves invaluable. When dealing with financial statements, sales data, research results, or any other tabular data, you can effortlessly import the data into Excel, enabling automated data analysis and processing.

 

For users who frequently create presentations, the PNG/JPG to PPT converter is especially handy. Converting images into the PPT format lets you turn them into slides or insert them into an existing slideshow with little effort.

 

All of these conversions rely on OCR technology, which recognizes the text within images and transforms it into editable text. This gives users more accurate and more editable results when converting image formats.

 

PDF Extractor SDK Improvements

 


ComPDFKit's PDF data extraction capabilities have been significantly improved with selective page extraction, conversion cancellation support, and simultaneous text, image, and table extraction on all platforms, including Mac, Windows, iOS, Android, and Linux.

 

Enhanced Page Selection: We have added support for selecting specific pages when extracting data from PDFs. This allows users to extract and work with data from specific sections or pages within a PDF document, providing more flexibility and control over the extraction process.
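To make the idea of page selection concrete, here is a minimal sketch of how an application might turn a user-supplied range string into a list of pages before handing it to an extraction call. The `parse_page_ranges` helper and the `"1-3,5"` range syntax are assumptions made for illustration; they are not part of the ComPDFKit API, so check the SDK guides for the actual page-selection parameters.

```python
def parse_page_ranges(spec: str, page_count: int) -> list[int]:
    """Expand a 1-based range string such as "1-3,5" into sorted page numbers.

    Hypothetical helper for illustration only; not a ComPDFKit function.
    """
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Reject ranges that fall outside the document or run backwards.
        if start < 1 or end > page_count or start > end:
            raise ValueError(
                f"invalid page range {part!r} for a {page_count}-page document"
            )
        pages.update(range(start, end + 1))
    return sorted(pages)


selected = parse_page_ranges("1-3,5", page_count=10)
```

Validating the range up front means an out-of-bounds page is reported before any extraction work starts.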

 

Conversion Cancellation Support: We understand the importance of flexibility in document processing. With the newly added support for canceling conversions during PDF data extraction, users can halt an ongoing process whenever they need to. This saves time and resources when requirements change partway through a long-running extraction.
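The sketch below shows one common way an application can structure cancellable work: a cooperative cancellation flag that is checked between pages. It illustrates the general pattern only. `run_extraction` and `fake_extract` are hypothetical names, and the SDK's own cancellation call may look quite different.

```python
import threading
from typing import Callable, Iterable


def run_extraction(
    pages: Iterable[int],
    extract_page: Callable[[int], str],
    cancel: threading.Event,
) -> tuple[list[str], bool]:
    """Extract pages in order; stop before the next page once `cancel` is set.

    Returns the results gathered so far and whether the run finished.
    Hypothetical driver loop for illustration; not a ComPDFKit function.
    """
    results = []
    for page in pages:
        if cancel.is_set():
            return results, False  # stopped early; finished pages are kept
        results.append(extract_page(page))
    return results, True


cancel = threading.Event()


def fake_extract(page: int) -> str:
    # Stand-in for real extraction work. Setting the flag here simulates
    # a user pressing "Cancel" while page 2 is being processed.
    if page == 2:
        cancel.set()
    return f"text of page {page}"


results, finished = run_extraction([1, 2, 3, 4], fake_extract, cancel)
```

In a real application the flag would typically be set from another thread, such as a UI handler, while the extraction runs in the background.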

 

Simultaneous Text, Image, and Table Extraction: ComPDFKit's PDF data extraction now supports extracting text, images, and tables from PDF documents at the same time. Characters, words, fonts, form fields, images, and data are extracted into structured formats such as JSON and XML, which makes for a more efficient and streamlined workflow.
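As a rough illustration of what working with a combined result could look like, the sketch below builds a small JSON document that holds text, images, and tables for one page and then counts each kind of element. The field names and layout are assumptions invented for this example; they do not reflect the SDK's actual output schema.

```python
import json

# Hypothetical combined result for one page. This layout is an assumption
# made for illustration and is not the SDK's real schema.
sample = {
    "page": 1,
    "text": [{"content": "Invoice 2024-001", "font": "Helvetica", "size": 12.0}],
    "images": [{"file": "page1_img1.png", "width": 640, "height": 480}],
    "tables": [{"rows": [["Item", "Qty"], ["Widget", "3"]]}],
}


def summarize(page_result: dict) -> dict:
    """Count the elements of each kind found on one page."""
    return {
        kind: len(page_result.get(kind, []))
        for kind in ("text", "images", "tables")
    }


encoded = json.dumps(sample, indent=2)  # serialize for storage or transport
decoded = json.loads(encoded)           # parse back into Python objects
counts = summarize(decoded)
```

Keeping all three content types in one structure lets downstream code handle a page in a single step, without merging separate text, image, and table outputs.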

 

More Details 

 

In addition to the three main features outlined above, this release adds support for table recognition in the flowing text layout when converting PDF to Word, allows for performing OCR when converting PDF to CSV, improves conversion quality when converting PDF to Excel and outputting the entire PDF document content in one worksheet, and fixes several smaller issues.

 

To view a comprehensive list of changes, please refer to the ComPDFKit Conversion SDK 1.10.0 changelogs for Windows, Mac, iOS, and Android.