Hi monks,
As suggested, I am trying to use CAM::PDF to extract text from PDF and PPT documents.
I installed CAM::PDF on my Ubuntu system and ran the following script.
#!/usr/bin/perl
use strict;
use warnings;
use CAM::PDF;
use CAM::PDF::PageText;
my $filename = shift || die "Supply pdf on command line\n";
my $pdf = CAM::PDF->new($filename);
print text_from_page(1);
sub text_from_page {
    my $pg_num = shift;
    return CAM::PDF::PageText->render($pdf->getPageContentTree($pg_num));
}
When I run this code with the page number set to 1, it returns all the text from page 1. But when I change the page number to 2, it prints the following:
Failed to open filter FlateDecode (Text::PDF::FlateDecode)
Unrecognized type in parseAny:
1 ڵZYs��~_� V���%&
+#65533;����K�N��Q
+5533;Jy��9$a...
Why does this happen? Could anyone please let me know?
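
In case it helps with diagnosis, here is a small check I can run on the same machine. It is only a sketch and rests on an assumption of mine, not something the error message confirms: that the FlateDecode filter named in the message needs Compress::Zlib to be loadable. The script just tries to load each module and reports the result.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption (not confirmed by the error text): the FlateDecode filter
# relies on Compress::Zlib, so a missing zlib binding could explain why
# compressed page streams fail to decode.
for my $module (qw(Compress::Zlib Text::PDF::Filter CAM::PDF)) {
    # String eval so the module name can be a variable.
    my $ok = eval "require $module; 1";
    if ($ok) {
        print "$module: loaded\n";
    }
    else {
        my $err = $@ || 'unknown error';
        chomp $err;
        print "$module: NOT loadable ($err)\n";
    }
}
```

If one of these modules is reported as not loadable, that would at least narrow down where to look.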