in reply to Read PDF files & do regex through Perl.

A PDF file is not a plain text file. It is a fairly complex binary format, so reading it with normal line-oriented I/O will not work.

Look into the PDF-oriented modules on CPAN (http://search.cpan.org/search?mode=module&query=PDF), or look for PDF tools on Freshmeat.net. You could use one of these to pre-process the PDF and extract the parts you want, which your Perl script can then handle.
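
As a rough sketch of that pre-processing approach: the script below assumes the external pdftotext utility (from Xpdf or Poppler) is installed and on your PATH, which is my choice for illustration and not the only option. The helper names grep_text and pdf_to_lines are made up for this example. It converts the PDF to plain text first, then applies the regex to the resulting lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every line of already-extracted text that matches the pattern.
# (Hypothetical helper name, used only for this example.)
sub grep_text {
    my ($pattern, @lines) = @_;
    return grep { /$pattern/ } @lines;
}

# Run an external PDF-to-text converter and return its output as lines.
# Assumption: "pdftotext" is installed; the trailing "-" asks it to
# write the extracted text to standard output instead of a file.
sub pdf_to_lines {
    my ($pdf_file) = @_;
    open(my $fh, '-|', 'pdftotext', '-layout', $pdf_file, '-')
        or die "Cannot run pdftotext: $!";
    my @lines = <$fh>;
    close($fh) or die "pdftotext failed on $pdf_file\n";
    return @lines;
}

# Only do the conversion when a PDF file name is given on the command line.
if (@ARGV) {
    my ($pdf_file, $regex) = @ARGV;
    $regex = 'Perl' unless defined $regex;    # placeholder default pattern
    print grep_text(qr/$regex/, pdf_to_lines($pdf_file));
}
```

You would run it as something like "perl pdfgrep.pl report.pdf 'invoice \d+'" (both names are placeholders). Text extraction from PDF is imperfect, so expect to adjust your regexes for odd spacing or broken lines in the converted output.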

--rjray
