blank pdf generated using PDF::API2

lennelei has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I have a very small Perl script that splits PDF files longer than 100 pages:

use strict;
use warnings;
use PDF::API2;
#...some code here...
#for testing purpose:
my $file='path_to_some.pdf';
#

my $oldpdf = PDF::API2->open($file);
if ($oldpdf->pages > 100) {
    my $newpdf = PDF::API2->new;
    printf " (%d pages)\n", $oldpdf->pages;
    for my $page_nb (1..10) {
        $newpdf->importpage($oldpdf, $page_nb);
    }
    $newpdf->saveas("_$file");
}

I'm running this on Windows (Windows 7 on my desktop, Windows 2008/2012 on the servers) with Strawberry Perl 5.14 and the PDF::API2 module installed using cpan.bat.

It has been in use for weeks without trouble, until this week. With a PDF received a few days ago, the script outputs a document of 100 blank pages.

I tried the importPageIntoForm alternative by snoopy from http://www.perlmonks.org/?node_id=615492, with the same result.

I also tried another tool (sejda) and the pages are extracted correctly, so it's probably an issue with PDF::API2 or a misconfiguration, but I don't know what to add or change in the script.

FYI, the sejda command line:

sejda-console.bat extractpages -f SOURCE.PDF -o TARGET.PDF -s 1-100

Any ideas or alternatives I could try? I'd like to stay with Perl, as this is only a small part of a bigger script, but if I have no other option, I'll use sejda for the split.
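If the sejda fallback is needed, it can still be driven from the existing Perl script. The following is only a minimal sketch: it reuses the `extractpages` flags from the command line quoted above and assumes `sejda-console.bat` is reachable through PATH. The helper names `sejda_extract_args` and `sejda_extract` are hypothetical, not part of sejda or any module.

```perl
use strict;
use warnings;

# Build the sejda argument list for extracting pages $first..$last.
# The flags (-f, -o, -s) are the ones from the command line quoted above.
sub sejda_extract_args {
    my ($source, $target, $first, $last) = @_;
    return ('extractpages', '-f', $source, '-o', $target, '-s', "$first-$last");
}

# Run sejda and die if it reports a failure.
# Assumption: sejda-console.bat can be found through PATH.
sub sejda_extract {
    my ($source, $target, $first, $last) = @_;
    my @cmd = ('sejda-console.bat', sejda_extract_args($source, $target, $first, $last));
    system(@cmd) == 0
        or die "sejda failed (exit status " . ($? >> 8) . ")\n";
    return $target;
}

# Run only when a source and a target file are given on the command line.
if (@ARGV == 2) {
    sejda_extract($ARGV[0], $ARGV[1], 1, 100);
}
```

Passing the command to `system` as a list keeps file names containing spaces intact, which matters for paths on Windows servers.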

Unfortunately, I cannot provide the PDF :(

Thank you

Edit: I just found CAM::PDF and tried it with the following code, and it works!

From what I've seen, the difference between the two is that PDF::API2's import_page function tries to copy the content of the pages, whereas CAM::PDF's extractPages function removes the pages outside the given range. Maybe there is a similar method in PDF::API2, but I haven't found it yet.

use strict;
use warnings;
use CAM::PDF;
#...some code here...
#for testing purpose:
my $file='path_to_some.pdf';
#

my $oldpdf = CAM::PDF->new($file) or die "$CAM::PDF::errstr\n";
if ($oldpdf->numPages() > 100) {
    printf " (%d pages)\n", $oldpdf->numPages();
    $oldpdf->extractPages(1..100);
    $oldpdf->cleanoutput("_$file");
}
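
The snippet above keeps only the first 100 pages. A possible extension to a full split into 100-page chunks is sketched below, using only the CAM::PDF calls already shown (`new`, `numPages`, `extractPages`, `cleanoutput`). The helper names `chunk_ranges` and `split_pdf` and the output file naming are hypothetical. Because `extractPages` removes pages from the object in place, the sketch reopens the source file for every chunk.

```perl
use strict;
use warnings;

# Return [first, last] page ranges covering 1..$total in chunks of at most $size pages.
sub chunk_ranges {
    my ($total, $size) = @_;
    my @ranges;
    for (my $first = 1; $first <= $total; $first += $size) {
        my $last = $first + $size - 1;
        $last = $total if $last > $total;
        push @ranges, [$first, $last];
    }
    return @ranges;
}

# Write one output file per chunk and return the list of files written.
sub split_pdf {
    my ($file, $size) = @_;
    require CAM::PDF;    # loaded lazily so chunk_ranges() can be used on its own

    my $probe = CAM::PDF->new($file) or die "$CAM::PDF::errstr\n";
    my @written;
    for my $range (chunk_ranges($probe->numPages(), $size)) {
        my ($first, $last) = @$range;
        # extractPages() modifies the object, so start from a fresh copy each time.
        my $pdf = CAM::PDF->new($file) or die "$CAM::PDF::errstr\n";
        $pdf->extractPages($first .. $last);
        # Hypothetical naming scheme: source name plus the page range.
        my $out = sprintf '%s.part_%04d-%04d.pdf', $file, $first, $last;
        $pdf->cleanoutput($out);
        push @written, $out;
    }
    return @written;
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    print "$_\n" for split_pdf($ARGV[0], 100);
}
```

Reparsing the source once per chunk is slow for very large files, but it avoids relying on any copy mechanism that the original post does not mention.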

Re: blank pdf generated using PDF::API2 by thanos1983 (Parson) on Jul 21, 2017 at 08:16 UTC
Hello lennelei,

Welcome to the monastery. Try import_page() (see PAGE METHODS in the PDF::API2 documentation); it should work, I tested it on my machine. Sample of working code:

Update: Minor note: there is no `importpage()` method; you probably mean `import_page()`, which works as expected. See the sample code below.

#!/usr/bin/perl
use strict;
use warnings;
use PDF::API2;

my $file='test.pdf';

my $newpdf = PDF::API2->new();
my $oldpdf = PDF::API2->open($file);

if ($oldpdf->pages() > 1) {
    # pages() returns the page count in PDF::API2 (numPages() belongs to CAM::PDF)
    printf " (%d pages)\n", $oldpdf->pages();
    for my $page_nb (1..8) {
        $newpdf->import_page($oldpdf, $page_nb, $page_nb);
    }
    $newpdf->saveas("test_2.pdf");
}

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!
Re^2: blank pdf generated using PDF::API2 by poj (Abbot) on Jul 21, 2017 at 08:44 UTC
Changes at Revision 2.022:

2.022 2014-07-04
- Added $pdf->version() get/set method. When opening an existing PDF, the existing version number will now be retained.
- Renamed the following in PDF::API2:
  - importpage to import_page
  - openScalar to open_scalar

poj
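
Given the rename listed above, a script that has to run against PDF::API2 installations on either side of 2.022 could pick the spelling at run time. This is only a sketch: the helper names `first_method` and `copy_pages` are hypothetical, and it assumes both spellings accept the same arguments, which matches the report further down the thread that both return the same result.

```perl
use strict;
use warnings;

# Return the first name in @names that $target (object or class) implements, or undef.
sub first_method {
    my ($target, @names) = @_;
    for my $name (@names) {
        return $name if $target->can($name);
    }
    return;
}

# Copy pages $first..$last of $file into $out, using whichever spelling
# of the import method the installed PDF::API2 provides.
sub copy_pages {
    my ($file, $out, $first, $last) = @_;
    require PDF::API2;    # loaded lazily so first_method() can be used on its own

    my $oldpdf = PDF::API2->open($file);
    my $newpdf = PDF::API2->new;
    my $import = first_method($newpdf, 'import_page', 'importpage')
        or die "no page import method found\n";
    $newpdf->$import($oldpdf, $_) for $first .. $last;
    $newpdf->saveas($out);
    return $import;
}

# Run only when a source and a target file are given on the command line.
if (@ARGV == 2) {
    my $used = copy_pages($ARGV[0], $ARGV[1], 1, 100);
    print "imported with $used\n";
}
```

This only settles the naming question. It does not address the blank pages reported for the one problem file.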
Re^3: blank pdf generated using PDF::API2 by thanos1983 (Parson) on Jul 21, 2017 at 09:10 UTC
Thanks poj, I had no clue... :D
Re^2: blank pdf generated using PDF::API2 by lennelei (Acolyte) on Jul 21, 2017 at 09:29 UTC
Thank you for your help and your welcome :)

You're right about the method name; I don't know why I used importpage (I probably found it in an old example). However, both methods return the same result, so it doesn't work for my specific PDF either.

As I (tried to) explain, my script works: we have been using it in a production environment for weeks. There is only one file it failed to process correctly, for which the output is a PDF of 100 blank pages.
Re^3: blank pdf generated using PDF::API2 (Updated) by thanos1983 (Parson) on Jul 21, 2017 at 09:36 UTC
Hello again lennelei,

Notice this line (`$newpdf->import_page($oldpdf, $page_nb, $page_nb);`): it uses 3 parameters, not 2. Try copying the example I provided and test it. Does it work? I simulated the scenario with a 7-page PDF and it seems to work just fine.

~~I do not have a 100-page PDF so I cannot really check it, but give it a try; I assume it should work.~~

Update: I just tested the sample code I provided with a 123-page PDF. It works just fine. The only line I modified is the for loop (`for my $page_nb (1..$oldpdf->pages())`).

Update2: Full sample of executable code below:

#!/usr/bin/perl
use strict;
use warnings;
use PDF::API2;
use feature 'say';

my $file='test.pdf';

my $newpdf = PDF::API2->new();
my $oldpdf = PDF::API2->open($file);

if ($oldpdf->pages() > 1) {
    say $oldpdf->pages() . ' pages.';
    for my $page_nb (1..$oldpdf->pages()) {
        $newpdf->import_page($oldpdf, $page_nb, $page_nb);
    }
    $newpdf->saveas("test_2.pdf");
}

Let us know if it works, BR.
Re^4: blank pdf generated using PDF::API2 by hippo (Archbishop) on Jul 21, 2017 at 09:47 UTC
Re^5: blank pdf generated using PDF::API2 by thanos1983 (Parson) on Jul 21, 2017 at 10:14 UTC
Re^4: blank pdf generated using PDF::API2 (Updated) by lennelei (Acolyte) on Jul 21, 2017 at 10:16 UTC
Re^5: blank pdf generated using PDF::API2 (Updated) by AnomalousMonk (Archbishop) on Jul 21, 2017 at 10:53 UTC
Re^5: blank pdf generated using PDF::API2 (Updated) by thanos1983 (Parson) on Jul 21, 2017 at 10:50 UTC