NodeReaper has asked for the wisdom of the Perl Monks concerning the following question:

This node was taken out by the NodeReaper on Fri Apr 2 05:05:34 2004 (EST)

Re: Extract text from PDF files
by Anonymous Monk on Apr 03, 2004 at 04:17 UTC

    Hi,

    Use the CPAN module PDF::Extract.

      I looked at the homepage for that module, and it says: "PDF::Extract works on the file structure of a PDF document not the content. There are plenty of PDF modules that can do that." Nothing in the module seems to extract the text so that it can be parsed. I also couldn't find any of the modules the quote alludes to that perform that task. If you've been able to do this, I'd be grateful if you could send a code snippet showing how it works.