I've done some rudimentary parsing of PDFs using CAM::PDF's getPageText() method, but I was only able to handle files in PDF v1.4 format (v1.5 and v1.6 files I couldn't parse).
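For what it's worth, here is a minimal sketch of the kind of extraction I mean. It relies only on CAM::PDF's new(), numPages() and getPageText(); the extract_pdf_text name is just something I made up for illustration, and I'm assuming the file is one that CAM::PDF can actually read.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of every page as a list of strings.
# CAM::PDF is loaded lazily so this file still compiles without it installed.
sub extract_pdf_text {
    my ($path) = @_;
    require CAM::PDF;
    my $pdf = CAM::PDF->new($path)
        or die "Cannot open $path: $CAM::PDF::errstr\n";
    my @pages;
    for my $n (1 .. $pdf->numPages()) {    # pages are numbered from 1
        my $text = $pdf->getPageText($n);
        push @pages, defined $text ? $text : '';
    }
    return @pages;
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    my @pages = extract_pdf_text($ARGV[0]);
    print "--- page $_ ---\n$pages[$_ - 1]\n" for 1 .. @pages;
}
```

Expect the output to be rough: the text comes back in whatever order it appears in the page content stream, so some cleanup is usually needed before comparing.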
I have not done anything similar with Word documents, but there must be something around that performs a similar extraction function.
Once you've extracted the text from each file, you'd then need to write the comparator function.
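As a rough idea of what such a comparator might look like, the sketch below normalizes whitespace and reports the first word at which two extracted texts diverge. The normalize and first_difference names are hypothetical, and I'm assuming a word-by-word comparison is good enough for your purposes; for a proper report of all differences, a CPAN module such as Algorithm::Diff would likely be a better fit.

```perl
use strict;
use warnings;

# Collapse runs of whitespace so layout differences do not count as changes.
sub normalize {
    my ($text) = @_;
    $text =~ s/\s+/ /g;
    $text =~ s/^ | $//g;    # trim a leading or trailing space
    return $text;
}

# Return the index of the first differing word, or -1 if the texts match.
sub first_difference {
    my ($left, $right) = @_;
    my @l = split / /, normalize($left);
    my @r = split / /, normalize($right);
    my $max = @l > @r ? @l : @r;
    for my $i (0 .. $max - 1) {
        # A missing word on either side also counts as a difference.
        return $i
            if !defined $l[$i] || !defined $r[$i] || $l[$i] ne $r[$i];
    }
    return -1;
}
```

You would feed it the joined page text from the PDF on one side and whatever your Word extraction produces on the other, for example first_difference(join(' ', @pdf_pages), $word_text).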
In reply to Re: Comparison word against pdf by thezip, in thread Comparison word against pdf by Anonymous Monk