The code you posted consists of verbatim snippets from the PDF::Extract synopsis. I don't know what you expect them to do, but I ran the following program on the ECMA-262 ECMAScript standard available from mozilla.org, and it did exactly what the documentation promised. It created a file E262-31..3.pdf, which I could open with Acrobat Reader. The newly created document started with page 1 of the ECMA standard 262, with the words "ECMAScript Language Specification", and ended with page 3, after the name Steve Leach.

    #!/usr/bin/perl -w
    use strict;
    use PDF::Extract;

    # tested on http://www.mozilla.org/js/language/E262-3.pdf
    my $filename   = 'E262-3.pdf';
    my $pages      = '1-3';
    my $outputname = 'E262-31..3.pdf';

    # see PDF::Extract documentation
    my $pdf = PDF::Extract->new();
    print "Saving from $filename pages $pages to $outputname";
    $pdf->savePDFExtract( PDFDoc => $filename, PDFPages => $pages );
    print ",done.\n";

    # report any error recorded by the module
    my $error = $pdf->getVars('PDFError');
    warn $error if $error;

    if (-f $outputname) {
        print "There now exists a file '$outputname'\n";
    } else {
        print "No file '$outputname' was found. Maybe there was some error?\n";
    };

I am not sure what different results you expected or what else you tried. You may need to reread the documentation, as neither of your examples seems to be about extracting ASCII text from PDF pages. I can't tell for sure, though, because you seem to be trying to mix HTML and PDF, which can't work.

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web

In reply to Re: Re: (2) Extract text from PDF by Corion
in thread Extract text from PDF by Anonymous Monk
