Text Search & Selection

Text Search

ComPDFKit offers developers an API for programmatic full-text search.

To search text inside a page, create an instance of CPDFTextSearcher and obtain the page's CPDFTextPage. Start a search by calling CPDFTextSearcher.FindStart(CPDFTextPage textPage, string keyword, C_Search_Options searchOption, int startIndex). Then call CPDFTextSearcher.FindNext(CPDFPage page, CPDFTextPage textPage, ref CRect rect, ref string content, ref int startIndex) repeatedly to retrieve the matches; each successful call fills in the bounds (rect) and the text (content) of one match.

Before triggering a search, you can configure various search options:

- C_Search_Options.Search_Case_Insensitive: Case insensitive.

- C_Search_Options.Search_Case_Sensitive: Case sensitive.

- C_Search_Options.Search_Match_Whole_Word: Match whole word.
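
To make the difference between the three options concrete, here is a minimal sketch that reproduces the matching modes on a plain string with .NET regular expressions. It does not use ComPDFKit and is not how the SDK implements its search. MatchMode, SearchOptionDemo, and FindMatches are hypothetical names introduced only for this illustration, and treating whole-word matching as case sensitive is an assumption.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the SDK's search options; not part of ComPDFKit.
public enum MatchMode
{
    CaseInsensitive,
    CaseSensitive,
    WholeWord
}

public static class SearchOptionDemo
{
    // Returns the start index of every match of keyword in text under the given mode.
    public static List<int> FindMatches(string text, string keyword, MatchMode mode)
    {
        var indices = new List<int>();
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(keyword))
        {
            return indices;
        }

        // Escape the keyword so it is matched literally, not as a regex pattern.
        string pattern = Regex.Escape(keyword);
        RegexOptions options = RegexOptions.CultureInvariant;

        switch (mode)
        {
            case MatchMode.CaseInsensitive:
                options |= RegexOptions.IgnoreCase;
                break;
            case MatchMode.WholeWord:
                // Assumption: whole-word matching is case sensitive in this sketch.
                // \b requires a word boundary on both sides of the keyword.
                pattern = @"\b" + pattern + @"\b";
                break;
            case MatchMode.CaseSensitive:
                break;
        }

        foreach (Match m in Regex.Matches(text, pattern, options))
        {
            indices.Add(m.Index);
        }
        return indices;
    }
}
```

For the text "PDF pdf PDFKit", searching for "pdf" in case-insensitive mode finds three matches, searching for "PDF" in case-sensitive mode finds two (the standalone word and the prefix of "PDFKit"), and searching for "PDF" in whole-word mode finds only the standalone word.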

The following function searches a page for a keyword and collects the text and bounds of every match:

void SearchForPage(CPDFPage page, string searchKeywords, C_Search_Options option, ref List<Rect> rects, ref List<string> strings)
{
    rects = new List<Rect>();
    strings = new List<string>();
    int findIndex = 0;

    // Get the text layer of the page and create a searcher.
    CPDFTextPage textPage = page.GetTextPage();
    CPDFTextSearcher searcher = new CPDFTextSearcher();

    // Begin the search at character index 0 with the given keyword and options.
    if (searcher.FindStart(textPage, searchKeywords, option, 0))
    {
        CRect textRect = new CRect();
        string textContent = "";

        // Each successful FindNext call fills in the bounds and text of one match.
        while (searcher.FindNext(page, textPage, ref textRect, ref textContent, ref findIndex))
        {
            strings.Add(textContent);
            rects.Add(new Rect(textRect.left, textRect.top, textRect.width(), textRect.height()));
        }
    }
}

Text Selection

The text content of a PDF page is exposed through a CPDFTextPage object, which is obtained from a specific CPDFPage. The CPDFTextPage class can be used to retrieve information about the text on that page, such as a single character, a single word, or the text content within a specified character range or bounds.

The following function gets the selected text and its bounds between two points on a page:

void SelectForPage(CPDFPage page, Point fromPoint, Point toPoint, ref List<Rect> rects, ref string textContent)
{
    CPDFTextPage textPage = page.GetTextPage();

    // Text covered by the selection that runs from fromPoint to toPoint.
    textContent = textPage.GetSelectText(fromPoint, toPoint);

    // Bounds of the selected characters, for example to draw a highlight.
    rects = textPage.GetCharsRectAtPos(fromPoint, toPoint, new Point(10, 10));
}