However, as has been said about .pdf conversion/testing/whatever -- in many threads here in the Monastery and elsewhere -- the standard for such docs has changed many times; many creators ignore or bork their implementation of the standards; and what you get from one file may not match what you get from a file produced by a different source.
You may wish to check CAM::PDF, and especially http://search.cpan.org/~cdolan/CAM-PDF-1.60/bin/getpdftext.pl, if you want to go further -- for example, to text extraction. But as the author, Chris Dolan (http://search.cpan.org/~cdolan/), has noted (in paraphrase), extracting information from some .pdf formats is a real pain.
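For what it's worth, here is a minimal sketch of the kind of per-page text extraction getpdftext.pl does, assuming CAM::PDF is installed. The file name comes from the command line and the page-separator line is my own addition for illustration, not something the module prints. Given the caveats above, expect this to work on some files and return little or nothing on others.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CAM::PDF;

# The PDF to read is taken from the command line (no default is assumed).
my $file = shift @ARGV or die "usage: $0 file.pdf\n";

# new() returns undef on failure and leaves the reason in $CAM::PDF::errstr.
my $pdf = CAM::PDF->new($file)
    or die "Cannot open $file: $CAM::PDF::errstr\n";

for my $page (1 .. $pdf->numPages()) {
    my $text = $pdf->getPageText($page);

    # Skip pages where no text could be recovered.
    next unless defined $text;

    # The separator is only for readability of this sketch's output.
    print "--- page $page ---\n$text\n";
}
```

Run it as `perl extract.pl some.pdf` (the script name is hypothetical) and compare the output against what you see in a viewer before trusting it.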
In reply to "Re: use PDF; woes" by ww, in thread "use PDF; woes" by zeltus