karthikasasanka has asked for the wisdom of the Perl Monks concerning the following question:

Hi Guys,

I have written a script that reads PDF files and writes the data into an Excel sheet, using the module CAM::PDF to read the PDF files. It was working fine, but for some PDF files it returns what looks like encrypted data instead of text.

Can anyone please tell me if there is a module or another way to solve this issue?

Thanks in advance.

Replies are listed 'Best First'.
Re: Reading PDF files
by marto (Cardinal) on Apr 22, 2011 at 13:32 UTC
Re: Reading PDF files
by elef (Friar) on Apr 22, 2011 at 17:31 UTC
    I'd just use Xpdf. The author of CAM::PDF himself says that the PDF-to-text converter was an afterthought, not a major part of the project, and that it doesn't work that well with messy files. pdftotext (part of Xpdf) works better.
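A minimal sketch of calling pdftotext from a Perl script, assuming the pdftotext binary is installed and on the PATH. The helper names pdftotext_command and pdf_to_text are made up for illustration; the `-layout` option and the `-` output argument (write to standard output) are standard pdftotext usage.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for pdftotext.
# "-layout" keeps the physical layout; "-" sends the text to stdout.
sub pdftotext_command {
    my ($pdf_path) = @_;
    return ('pdftotext', '-layout', $pdf_path, '-');
}

# Hypothetical helper: run pdftotext and return everything it prints.
sub pdf_to_text {
    my ($pdf_path) = @_;
    my @cmd = pdftotext_command($pdf_path);

    # List-form pipe open avoids passing the file name through a shell.
    open(my $fh, '-|', @cmd) or die "cannot run pdftotext: $!\n";
    local $/;                      # slurp mode
    my $text = <$fh>;
    close($fh) or die "pdftotext exited with status $?\n";
    return $text;
}

# Only runs when a file name is given on the command line.
print pdf_to_text($ARGV[0]) if @ARGV;
```

The returned text can then be split into lines and written to the spreadsheet the same way the CAM::PDF output was.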
      Thanks for the response. The PDF file contains the string '<</Length 6 0 R/Filter /FlateDecode>>', so I think the PDF file is compressed with Deflate. I tried the code below:
      use IO::Uncompress::Unzip qw(unzip $UnzipError);
      my $z = new IO::Uncompress::Unzip $in
          or die "unzip failed: $UnzipError\n";
      my $op = [];
      unzip $in => $op;
      I also tried AnyInflate and AnyUncompress, but the issue still persists.

      Can anyone tell me if you have any idea about inflate and deflate?

      Thanks, Kartheek
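For what it's worth, a /FlateDecode stream is zlib data (RFC 1950), not a zip archive, so IO::Uncompress::Unzip is the wrong tool, and the inflater needs only the bytes between "stream" and "endstream", not the whole file. Below is a rough sketch under those assumptions. It builds a stand-in stream object with IO::Compress::Deflate so it runs without a real PDF; the helper name inflate_pdf_stream and the sample content are invented for illustration. It does not handle encrypted PDFs or streams with predictors.

```perl
use strict;
use warnings;
use IO::Compress::Deflate   qw(deflate $DeflateError);
use IO::Uncompress::Inflate qw(inflate $InflateError);

# Hypothetical helper: inflate the raw bytes of one FlateDecode stream.
sub inflate_pdf_stream {
    my ($compressed) = @_;
    my $plain;
    inflate(\$compressed => \$plain)
        or die "inflate failed: $InflateError\n";
    return $plain;
}

# Stand-in for one PDF stream object (made-up content, zlib-compressed).
my $content = "BT /F1 12 Tf (Hello, monks) Tj ET";
my $packed;
deflate(\$content => \$packed) or die "deflate failed: $DeflateError\n";
my $object = "<</Length " . length($packed) . "/Filter /FlateDecode>>\n"
           . "stream\n$packed\nendstream";

# Extract only the stream bytes, then inflate them.
my ($raw) = $object =~ /stream\r?\n(.*?)\nendstream/s
    or die "no stream found\n";
my $text = inflate_pdf_stream($raw);
print "$text\n";
```

On a real file you would read the PDF in binary mode (binmode) and, where /Length is an indirect reference such as "6 0 R", either resolve that object or rely on the "endstream" marker as above.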
Re: Reading PDF files
by bart (Canon) on Apr 22, 2011 at 12:18 UTC
    Are you sure it is encrypted and not simply gzipped?
Re: Reading PDF files
by Anonymous Monk on Apr 22, 2011 at 07:35 UTC
    but for some PDF files it was reading the file in encrypted format instead of text format. can anyone please tell me if any module or a way to solve the issue?

    This is a feature of PDF files: if a file is encrypted, you need to supply the password (or whatever credential was used) to decrypt it.