The code you posted consists of verbatim snippets from the PDF::Extract synopsis. I don't know what you think they should do, but I tried the following program on the ECMA ECMAScript 1.3 standard available from mozilla.org, and it did exactly what the documentation promised: it created a file E262-31..3.pdf, which I could open with Acrobat Reader. The newly created document started with page one of the ECMA standard 262, with the words ECMAScript Language Specification, and ended with page 3, after the name Steve Leach.

    #!/usr/bin/perl -w
    use strict;
    use PDF::Extract;

    # tested on http://www.mozilla.org/js/language/E262-3.pdf
    my $filename = 'E262-3.pdf';
    my $pages = '1-3';
    my $outputname = 'E262-31..3.pdf';

    # see PDF::Extract documentation
    my $pdf = PDF::Extract->new();
    print "Saving from $filename pages $pages to $outputname";
    $pdf->savePDFExtract( PDFDoc => $filename, PDFPages => $pages );
    print ",done.\n";

    my $error = $pdf->getVars('PDFError');
    warn $error if $error;

    if (-f $outputname) {
        print "There now exists a file '$outputname'\n";
    } else {
        print "No file '$outputname' was found. Maybe there was some error?\n";
    };
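Note that the program never passes $outputname to the module; it only checks for a name it expects. Judging from this single run, the output name looks like the input name without its extension, followed by the page range with '-' turned into '..'. The sketch below reproduces that guess. The helper guess_output_name is hypothetical and not part of PDF::Extract, and the rule is inferred from one observed file name, not from the module's source, so treat it as an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, NOT part of PDF::Extract. It guesses the output
# name from the one observed case:
#   E262-3.pdf with pages '1-3' produced E262-31..3.pdf
sub guess_output_name {
    my ($filename, $pages) = @_;
    (my $base  = $filename) =~ s/\.pdf\z//i;   # drop the .pdf extension
    (my $range = $pages)    =~ s/-/../g;       # '1-3' becomes '1..3'
    return "$base$range.pdf";
}

print guess_output_name('E262-3.pdf', '1-3'), "\n";
```

If the name your installed version produces differs, list the directory after the run instead of relying on this guess.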

I am not sure what different results you expected or what else you tried. You may need to reread the documentation, as neither of your examples seems to be about extracting ASCII text from PDF pages. I can't tell for sure, though, because you seem to be trying to mix HTML and PDF, which can't work.

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web