in reply to Comparison word against pdf
see Parsing PDFs by text position?
Cheers Rolf
(addicted to the Perl Programming Language)
¹) Some PDFs store their fonts as internal bitmaps, so the text cannot be extracted directly and only OCR would help.