in reply to Help with PDF module

G'day BernieC,

Here's a quick comparison of PDF-related modules.

PDF
As ++marto states, this is 23 years old and has active bugs. Probably abandonware; I'd suggest avoiding this one.
CAM::PDF
Suggested by AM. This is 10 years old, has a lot of bugs, and is probably abandonware. See "CAM::PDF Error: Expected identifier label" for an example problem and discussion.
PDF::API2
I'm most familiar with this one. It was last updated just 6 months ago. I did a quick test for the page count you were trying (code mostly just copied from the SYNOPSIS). I'm not sure what else you might want; perhaps "METADATA METHODS" is of interest.
$ perl -E '
    use strict;
    use warnings;
    use PDF::API2;
    my $pdf = PDF::API2->new();
    my $font = $pdf->font("Helvetica-Bold");
    for my $p (1 .. 10) {
        my $page = $pdf->page();
        my $text = $page->text();
        $text->font($font, 20);
        $text->position(200, 700);
        $text->text("Page: $p");
    }
    $pdf->save("test.pdf");
'
$ file test.pdf
test.pdf: PDF document, version 1.4, 10 pages
$ perl -E '
    use strict;
    use warnings;
    use PDF::API2;
    my $pdf = PDF::API2->open("test.pdf");
    say "Page count: ", $pdf->page_count();
'
Page count: 10
PDF::Builder
Suggested by marto. It was last updated just 4 months ago. I hadn't encountered this one previously. Its SYNOPSIS is almost identical to PDF::API2's. It may be a branch of PDF::API2 that's intended to provide improvements or enhancements: it mentions PDF::API2 a few times by way of comparison. I didn't see anything regarding a branch, but I also didn't study the docs in detail. It has "METADATA METHODS" too.

See its README.md for possible hurdles to using this, such as requiring Perl v5.24; having said that, it installed first time for me using the cpan utility (I have Perl v5.36.0).

Given the similarities, I just repeated the test I did previously, replacing API2 with Builder and test.pdf with test2.pdf. At least in this respect, PDF::API2 and PDF::Builder function identically. If anyone has other information re PDF::API2 vs. PDF::Builder, please add comments.

$ perl -E '
    use strict;
    use warnings;
    use PDF::Builder;
    my $pdf = PDF::Builder->new();
    my $font = $pdf->font("Helvetica-Bold");
    for my $p (1 .. 10) {
        my $page = $pdf->page();
        my $text = $page->text();
        $text->font($font, 20);
        $text->position(200, 700);
        $text->text("Page: $p");
    }
    $pdf->save("test2.pdf");
'
$ file test2.pdf
test2.pdf: PDF document, version 1.4, 10 pages
$ perl -E '
    use strict;
    use warnings;
    use PDF::Builder;
    my $pdf = PDF::Builder->open("test2.pdf");
    say "Page count: ", $pdf->page_count();
'
Page count: 10

Update (additional information): I just noticed that the PDF produced by PDF::Builder is substantially bigger than that produced by PDF::API2. I would have expected them to be almost the same size.

ken@titan ~/tmp/pm_11152014_pdf
$ ls -l
total 16
-rw-r--r-- 1 ken None 4272 May  7 00:50 test.pdf
-rw-r--r-- 1 ken None 7024 May  7 01:05 test2.pdf
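To put a number on "substantially bigger", here is a minimal sketch that works out the relative difference from the two sizes in the listing. The helper name percent_bigger is made up for illustration; for files on disk you could feed it the results of Perl's -s file test instead of the hard-coded sizes.

```perl
use strict;
use warnings;

# Sizes in bytes, copied from the ls -l listing above.
my %size = ('test.pdf' => 4272, 'test2.pdf' => 7024);

# Hypothetical helper: how much bigger $new is than $old, as a percentage.
sub percent_bigger {
    my ($old, $new) = @_;
    return 100 * ($new - $old) / $old;
}

printf "test2.pdf is %.0f%% bigger than test.pdf\n",
    percent_bigger(@size{qw(test.pdf test2.pdf)});
```

This only quantifies the gap; it says nothing about where the extra bytes in the PDF::Builder file come from.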

— Ken

Re^2: Help with PDF module [comparison]
by kcott (Archbishop) on May 07, 2023 at 00:48 UTC
    "If anyone has other information re PDF::API2 vs. PDF::Builder, please add comments."

    Thanks to a private message from pryrt: "PDF::Builder::Docs - additional documentation for Builder module". (Pity you can't give a /msg a ++.)

    In particular, see the History section, which describes how PDF::API2 led to PDF::Builder: similar to what I guessed, but it's a lot more involved.

    "... repeated the test ... replacing API2 with Builder ..."

    Apparently, not just a bit of luck. From the same section: 'At least initially, any program written based on PDF::API2 should be convertible to PDF::Builder simply by changing "API2" anywhere it occurs to "Builder".' — so, for anyone wishing to upgrade applications from PDF::API2 to PDF::Builder, it's possibly as easy as a simple global change: s/API2/Builder/g.
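As a rough illustration of that global change, here is a minimal sketch that applies the substitution to a string of Perl source. The helper name api2_to_builder is hypothetical, and anchoring the pattern on "PDF::API2" is my own assumption: it is narrower than a bare s/API2/Builder/g, so unrelated occurrences of "API2" are left alone.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of either module): rewrite PDF::API2
# references in a chunk of Perl source to PDF::Builder.
sub api2_to_builder {
    my ($source) = @_;
    # Copy first, then substitute, so the caller's string is untouched.
    (my $converted = $source) =~ s/\bPDF::API2\b/PDF::Builder/g;
    return $converted;
}

my $old = 'use PDF::API2; my $pdf = PDF::API2->open("test.pdf");';
print api2_to_builder($old), "\n";
```

Given the "At least initially" caveat in the quoted docs, it would be sensible to rerun your own tests after a change like this rather than assume the converted code behaves identically.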

    — Ken

Re^2: Help with PDF module [comparison]
by Anonymous Monk on May 08, 2023 at 10:35 UTC
    Here's a quick comparison of PDF-related modules

    Wait, I don't see any comparison beyond an (incomplete) enumeration; it looks like you forgot to append the results. So here are some of them, for a start, using the simple test file generated with the code you kindly provided:

    use strict;
    use warnings;
    use PDF::API2;
    use CAM::PDF;
    use Benchmark 'cmpthese';

    my $fn = 'test.pdf';
    my $str = do { local ( @ARGV, $/ ) = $fn; <> };

    cmpthese -1, {
        'PDF::API2' => sub { PDF::API2-> from_string( $str )-> page_count },
        'CAM::PDF'  => sub { CAM::PDF-> new( $str )-> numPages },
        'CAM::PDF+' => sub {
            my $d = CAM::PDF-> new( $str );
            $d-> cacheObjects;
            $d-> numPages
        },
    };

    __END__
                Rate PDF::API2 CAM::PDF+  CAM::PDF
    PDF::API2  163/s        --      -74%      -95%
    CAM::PDF+  614/s      277%        --      -83%
    CAM::PDF  3586/s     2106%      484%        --
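For anyone unsure how to read the cmpthese table: each cell says how much faster (positive) or slower (negative) the row's code is than the column's. A minimal sketch that recomputes two cells from the printed rates follows; the helper name relative_pct is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: the percentage shown in a cmpthese cell, i.e. how
# much faster (positive) or slower (negative) the row is than the column.
sub relative_pct {
    my ($row_rate, $col_rate) = @_;
    return 100 * ($row_rate / $col_rate - 1);
}

# Rounded rates (iterations per second) copied from the table above.
my %rate = ('PDF::API2' => 163, 'CAM::PDF+' => 614, 'CAM::PDF' => 3586);

printf "CAM::PDF vs PDF::API2: %.0f%%\n",
    relative_pct($rate{'CAM::PDF'}, $rate{'PDF::API2'});
printf "CAM::PDF vs CAM::PDF+: %.0f%%\n",
    relative_pct($rate{'CAM::PDF'}, $rate{'CAM::PDF+'});
```

The recomputed figures land close to the table's but not always exactly on them, presumably because the table prints rounded rates while the percentages were derived from unrounded ones.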

    The 'plussed' entry (a kind of "parse everything") is for those who may have (reasonable) doubts about whether one module (guess which) works harder up front to extract a lot more information, so as to give the user a richer environment for inspecting things. In fact, without that extra step, both seem to extract approximately the same amount of information on open. It's just that one parser (guess which) is very poor indeed.