in reply to Reading PDF files

PDF::API2 has a stringify method to extract the text from a PDF. It is a very easy-to-use module.

--traveler

Re: Re: Reading PDF files
by Helter (Chaplain) on Jul 21, 2003 at 15:14 UTC
    Giving this a whirl here at work, it seems that the PDFs on the site I'm trying to work with are malformed. I can open PDF files from other sites, but not from the one I need.

    Malformed PDF file PDF::API2::IOString=GLOB(0x2252cc) at C:/Perl/site/lib/PDF/API2/PDF/FileAPI.pm line 84.
    Acrobat must be less picky than this module, as I can view them just fine in Reader.
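For anyone running into the same thing across many files, here is a minimal sketch of trapping that error so one bad PDF does not stop the whole run. It assumes only that PDF::API2->open dies on a file it cannot parse, as in the message above; the helper name try_open_pdf is made up for illustration and is not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::API2): try to open a PDF and
# trap the die() raised for files the module cannot parse.
# Returns ($pdf, undef) on success or (undef, $error) on failure.
sub try_open_pdf {
    my ($path) = @_;
    my $pdf = eval {
        require PDF::API2;      # loaded lazily so a missing module is trapped too
        PDF::API2->open($path);
    };
    return ($pdf, undef) if $pdf;

    my $err = $@ || 'unknown error';
    $err =~ s/\s+\z//;          # strip the trailing newline from the die message
    return (undef, $err);
}

# Report on every file named on the command line.
for my $file (@ARGV) {
    my ($pdf, $err) = try_open_pdf($file);
    if ($pdf) {
        print "$file: opened OK\n";
    }
    else {
        print "$file: could not be opened ($err)\n";
    }
}
```

Run it with the PDF paths as arguments. Files the module rejects are reported with the trapped message instead of aborting the script.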

    It also seems the stringify function does not parse out the text; it looks like I get the same output I would get from a plain open() call.

    Thanks for the suggestion.
      I have had similar problems with various PDF-to-text utilities; they work for some PDFs, but not all.

      PDF::API2 is still under active development; versions much more recent than the one on CPAN are usually available at or near the SourceForge page. Building the latest version might get around the error you're getting. I use 0.3d67, which probably isn't even the most recent any more.