
Really? Seems to work ok for me. Quick test:
#!/usr/bin/perl
use strict;
use warnings;
use CAM::PDF;

my $input  = 'E:\vecguid.pdf';
my $output = 'E:\Test.pdf';

my $pdf = CAM::PDF->new($input) or die "$CAM::PDF::errstr\n";
$pdf->output($output);

Did you try any of the example scripts?

Martin

Re^4: read pdf text in hidden layer?
by leocharre (Priest) on May 11, 2007 at 12:31 UTC

    Wow, thanks, this made a big difference. I didn't see these examples mentioned in the CAM::PDF docs. They are really useful utilities.

    I still hold to what I said earlier. I think docs/errors need more 'basic stuff'.

    I have a PDF that fails on open with the error 'Expected stream open tag'. It's a die() with no tangible info: which method died? Which module? (It comes from CAM::PDF::parseStream(). I had to find | xargs grep for it. That's fine, I can do that, but not everyone should have to.) Some Carp::confess() would be nice here.
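For anyone hitting the same wall: one way to get the missing context without patching the module is to route die() through Carp::confess for the duration of a call. This is only a sketch. with_backtrace() and parse_stream_stub() are hypothetical names made up for illustration, and the stub just imitates the error message above. It is not CAM::PDF code.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical helper: run a code ref with die() routed through
# Carp::confess, so the error carries a full stack trace.
sub with_backtrace {
    my ($code) = @_;
    local $SIG{__DIE__} = \&Carp::confess;   # restored when this sub returns
    return $code->();
}

# Hypothetical stand-in for a library routine that dies with a bare message.
sub parse_stream_stub {
    die "Expected stream open tag\n";
}

# Trap the error so the script can inspect or log the trace.
my $ok    = eval { with_backtrace(sub { parse_stream_stub() }); 1 };
my $trace = $ok ? '' : $@;
print $trace unless $ok;
```

With the real module, the code ref would presumably wrap the CAM::PDF->new($input) call instead of the stub, and the trace should then name the sub that died.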

    pdftotext from xpdf works fine with the same file that CAM::PDF chokes on. Dunno why. The PDF may be corrupt.

    Thank you for pointing these out!

    update

    It seems CAM::PDF::parseStream() expects a PDF stream tag to be followed by a newline. Some of these files have moved between NTFS and ext3 and seem to have had their newlines mangled along the way (you know the old story with FTP binmode). So maybe xpdf accepts ^M as a newline but CAM::PDF doesn't? Maybe I'll write to the author.
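A quick way to test the newline theory is to look at the raw bytes after each "stream" keyword. As far as I know, the PDF specification wants that keyword followed by CRLF or a lone LF, and a bare CR is not valid there, which would fit a file that went through a text-mode transfer. The sketch below is a rough diagnostic, not a PDF parser: classify_stream_eols() is a hypothetical helper, and it will also count any "stream" text that happens to sit inside binary data.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count which end-of-line bytes follow each
# "stream" keyword in raw PDF data ("endstream" is skipped).
sub classify_stream_eols {
    my ($data) = @_;
    my %count = (CRLF => 0, LF => 0, CR => 0, other => 0);
    while ($data =~ /(?<!end)stream(\r\n|\n|\r)?/g) {
        my $eol = $1;
        if    (!defined $eol)  { $count{other}++ }   # no EOL right after keyword
        elsif ($eol eq "\r\n") { $count{CRLF}++ }
        elsif ($eol eq "\n")   { $count{LF}++ }
        else                   { $count{CR}++ }      # bare ^M, the suspect case
    }
    return \%count;
}

# Usage: perl check_eols.pl file.pdf   (the script name is arbitrary)
if (@ARGV) {
    my $file = shift @ARGV;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!\n";
    local $/;                      # slurp the whole file as bytes
    my $data = <$fh>;
    close $fh;
    my $c = classify_stream_eols($data);
    printf "%-5s %d\n", $_, $c->{$_} for qw(CRLF LF CR other);
}
```

A non-zero CR count would suggest the line endings were rewritten in transit. Fetching the file again in binary mode is probably a safer fix than editing it by hand, since the byte offsets in the xref table would also be affected.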