Text Search
ComPDFKit offers developers an API for programmatic full-text search.
To search text on a page, create an instance of CPDFTextSearcher and get the page's CPDFTextPage. Start the search by calling CPDFTextSearcher.FindStart(CPDFTextPage textPage, string keyword, C_Search_Options searchOption, int startIndex). Then call CPDFTextSearcher.FindNext(CPDFPage page, CPDFTextPage textPage, ref CRect rect, ref string content, ref int startIndex) repeatedly to retrieve the matches one at a time. In the sample that follows, the loop ends when FindNext returns false.
Before triggering a search, you can configure various search options:
- C_Search_Options::Search_Case_Insensitive: Case insensitive.
- C_Search_Options::Search_Case_Sensitive: Case sensitive.
- C_Search_Options::Search_Match_Whole_Word: Match whole word.
How to get the text area on a page by searching:
void SearchForPage(CPDFPage page, string searchKeywords, C_Search_Options option, ref List<Rect> rects, ref List<string> strings)
{
    rects = new List<Rect>();
    strings = new List<string>();
    int findIndex = 0;

    // The text page gives access to the text content of the page.
    CPDFTextPage textPage = page.GetTextPage();
    CPDFTextSearcher searcher = new CPDFTextSearcher();

    // Start the search at index 0 with the requested options.
    if (searcher.FindStart(textPage, searchKeywords, option, 0))
    {
        CRect textRect = new CRect();
        string textContent = "";

        // Each successful FindNext call fills in the rectangle and text of one match.
        while (searcher.FindNext(page, textPage, ref textRect, ref textContent, ref findIndex))
        {
            strings.Add(textContent);
            rects.Add(new Rect(textRect.left, textRect.top, textRect.width(), textRect.height()));
        }
    }
}
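The per-page helper above can be reused to search a whole document. The following is a minimal sketch rather than an official sample: the SearchDocument method is hypothetical, and the CPDFDocument.PageCount and CPDFDocument.PageAtIndex members, as well as the namespaces in the using directives, do not appear in this section and are assumed from the wider ComPDFKit API. It also assumes SearchForPage is defined in the same class.

```csharp
using System.Collections.Generic;
using System.Windows;            // Rect
using ComPDFKit.Import;          // C_Search_Options (assumed namespace)
using ComPDFKit.PDFDocument;     // CPDFDocument (assumed namespace)
using ComPDFKit.PDFPage;         // CPDFPage (assumed namespace)

// Hypothetical helper: collects the match rectangles for every page that
// contains the keyword, keyed by zero-based page index.
Dictionary<int, List<Rect>> SearchDocument(CPDFDocument document, string keyword, C_Search_Options option)
{
    var hitsByPage = new Dictionary<int, List<Rect>>();

    // PageCount and PageAtIndex are assumed members of CPDFDocument.
    for (int pageIndex = 0; pageIndex < document.PageCount; pageIndex++)
    {
        CPDFPage page = document.PageAtIndex(pageIndex);

        // ref arguments must be assigned before the call; SearchForPage replaces them.
        List<Rect> rects = null;
        List<string> strings = null;
        SearchForPage(page, keyword, option, ref rects, ref strings);

        // Record only the pages that actually contain a match.
        if (rects.Count > 0)
        {
            hitsByPage[pageIndex] = rects;
        }
    }

    return hitsByPage;
}
```

Keying the results by page index keeps each rectangle tied to the page it came from, which is usually what is needed to highlight a match or jump to it.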
Text Selection
PDF text content is exposed through CPDFTextPage objects, each of which belongs to a specific page. The CPDFTextPage class can be used to retrieve information about the text on a PDF page, such as a single character, a single word, or the text content within a specified character range or bounds.
How to get the text bounds on a page by selection:
void SelectForPage(CPDFPage page, Point fromPoint, Point toPoint, ref List<Rect> rects, ref string textContent)
{
    CPDFTextPage textPage = page.GetTextPage();

    // Text selected between the two points.
    textContent = textPage.GetSelectText(fromPoint, toPoint);

    // Rectangles covering the selected characters.
    rects = textPage.GetCharsRectAtPos(fromPoint, toPoint, new Point(10, 10));
}
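A selection often spans several rectangles, for example one per line of text. As a hedged sketch of how the result of SelectForPage might be consumed, the hypothetical GetSelectionBounds method below merges those rectangles into a single bounding box using the WPF Rect type. It assumes SelectForPage is defined in the same class and that the namespace in the using directive matches your SDK version.

```csharp
using System.Collections.Generic;
using System.Windows;            // Point, Rect
using ComPDFKit.PDFPage;         // CPDFPage (assumed namespace)

// Hypothetical helper: returns the smallest rectangle enclosing the whole
// selection and outputs the selected text.
Rect GetSelectionBounds(CPDFPage page, Point fromPoint, Point toPoint, out string text)
{
    // ref arguments must be assigned before the call; SelectForPage replaces them.
    List<Rect> rects = null;
    text = null;
    SelectForPage(page, fromPoint, toPoint, ref rects, ref text);

    // Start from the empty rectangle and grow it to include each piece.
    Rect bounds = Rect.Empty;
    if (rects != null)
    {
        foreach (Rect rect in rects)
        {
            bounds.Union(rect);
        }
    }

    // Still Rect.Empty if nothing was selected.
    return bounds;
}
```

Callers can check bounds.IsEmpty to tell an empty selection apart from a real one.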