in reply to Re^2: blank pdf generated using PDF::API2
in thread blank pdf generated using PDF::API2

Hello again lennelei,

Notice this line ($newpdf->import_page($oldpdf, $page_nb, $page_nb);): it uses 3 parameters, not 2. Try copying the example that I provided and test it. Does it work? I simulated the scenario with a pdf of 7 pages and it seems to be working just fine.

I do not have a pdf of 100 pages so I cannot really check it, but give it a try; I assume it should work.

Update: I just tested the sample of code that I provided you with a pdf of 123 pages. It works just fine. The only line that I modified is the for loop (for my $page_nb (1..$oldpdf->pages())).

Update2: Full sample of executable code below:

#!/usr/bin/perl
use strict;
use warnings;
use PDF::API2;
use feature 'say';

my $file   = 'test.pdf';
my $newpdf = PDF::API2->new();
my $oldpdf = PDF::API2->open($file);

if ($oldpdf->pages() > 1) {
    say $oldpdf->pages() . ' pages.';
    for my $page_nb (1 .. $oldpdf->pages()) {
        $newpdf->import_page($oldpdf, $page_nb, $page_nb);
    }
    $newpdf->saveas("test_2.pdf");
}

Let us know if it works, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

Re^4: blank pdf generated using PDF::API2
by hippo (Archbishop) on Jul 21, 2017 at 09:47 UTC
    I do not have a pdf of 100 pages

    You mean you don't have an e-copy of Modern Perl?

    Your code (with some adjustments for my elderly version of PDF::API2) works fine on my first edition of Modern Perl (186 pages).

      Hello hippo,

      I have updated the script and tested it with 123 pages. The new pdf is just fine; I do not understand why it is not working for lennelei.

      You wrote: "Your code (with some adjustments for my elderly version of PDF::API2) works fine on my first edition of Modern Perl (186 pages)." What are the adjustments that you applied?

      BR, Thanos.

      Seeking for Perl wisdom...on the process of learning...not there...yet!

        Here is the code with the changes to enable it to work with PDF::API2 2.020 (and removing the fixed page count):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use PDF::API2;

        my $file   = 'test.pdf';
        my $newpdf = PDF::API2->new();
        my $oldpdf = PDF::API2->open($file);

        if ($oldpdf->pages() > 1) {
            printf " (%d pages)\n", $oldpdf->pages();
            for my $page_nb (1 .. $oldpdf->pages) {
                $newpdf->importpage($oldpdf, $page_nb, $page_nb);
            }
            $newpdf->saveas("test_2.pdf");
        }

        One suspects that there is something peculiar about lennelei's PDF other than the number of pages which is causing it to have trouble. Without the real input data it will be hard to debug.

        Update: It appears that my spidey-sense is again uncannily prescient.

Re^4: blank pdf generated using PDF::API2 (Updated)
by lennelei (Acolyte) on Jul 21, 2017 at 10:16 UTC

    Using 3 parameters produces exactly the same result with my pdf (I use your script without the printf " (%d pages)\n", $oldpdf->numPages(); line, which is for CAM::PDF). I'd like to provide the pdf file so that you can try, but I have to remove sensitive information from it first, and I don't really know how to do this for the moment :)

    Thank you again!

      I'd like to provide the pdf file ... I don't really know how ...

      How about going the other way? What happens when you run your code (or, indeed, the other monks' code) against some 100+ page document they seem to be having success with, e.g., Modern Perl?


      Give a man a fish:  <%-{-{-{-<

        First of all, I'm sorry if everything is not clear: English is not my native language.

        If you want the full story, here it is (you can skip that part :). We (my company) have specific OCR software to handle PDF bills. I made a Perl script that extracts pdf files from emails, or retrieves them from our MFT software, and renames them for normalization. As I learned that our OCR software doesn't work well with big pdf files, I tried to add some code to the script to check whether a file is more than 100 pages long and, in that case, keep only the first 100 pages (and drop the rest).

        As I didn't want to bother you with all the details, I only kept the part that cuts the pdf in my first post.

        To summarize: for any pdf, I need to keep at most the first 100 pages (if the pdf is 15 pages, I leave it untouched; if it's 654 pages, I create a new pdf with pages 1 to 100 inclusive).
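        The rule above can be sketched with PDF::API2 roughly as follows. This is only a minimal sketch: the helper names pages_to_copy and truncate_pdf and the output file name are assumptions made for illustration, not code from the original script. It relies on the same pages, import_page and saveas calls used elsewhere in this thread (older releases spell the import call importpage, as in hippo's version).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: page numbers to copy for a pdf of $total pages
# when at most $max pages may be kept. Returns an empty list when the
# file is short enough to be left untouched. Call it in list context.
sub pages_to_copy {
    my ($total, $max) = @_;
    return () if $total <= $max;
    return (1 .. $max);
}

# Hypothetical helper: write the first $max pages of $in to $out.
# Returns the number of pages written (0 means nothing was written).
sub truncate_pdf {
    my ($in, $out, $max) = @_;
    require PDF::API2;    # loaded lazily; pages_to_copy has no dependency

    my $oldpdf = PDF::API2->open($in);
    my @pages  = pages_to_copy($oldpdf->pages(), $max);
    return 0 unless @pages;

    my $newpdf = PDF::API2->new();
    for my $page_nb (@pages) {
        # Same three-argument form as in the examples in this thread
        $newpdf->import_page($oldpdf, $page_nb, $page_nb);
    }
    $newpdf->saveas($out);
    return scalar @pages;
}

# Usage (file names are assumptions):
# truncate_pdf('test.pdf', 'test_first100.pdf', 100);
```

        Keeping the page selection in its own small function makes the 15-page and 654-page cases easy to check without any pdf at hand.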

        ---- End of the story ----

        Once again, my script is working (99.9% of the time): my problem is not how to write it but why it fails for one (only one) pdf and what I can do about it (if I can do anything)!

        I didn't try the script against the "Modern Perl" file because unfortunately, I don't have it (yet), but I have a lot of 100+ page pdf files (up to 600 pages) and they are all (but one) correctly processed by my script.

        I would like to provide you with this specific pdf, which probably has something that prevents PDF::API2 from processing it correctly, but I cannot as it contains customer information (I'm looking for a way to obfuscate the content).

        What's strange is that I managed to extract 100 pages from this specific pdf using sejda or CAM::PDF and the extractPages method.

        But with PDF::API2, it's not working.
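        Since CAM::PDF copes with this file, one possible workaround is to do the truncation with it instead. A minimal sketch, assuming CAM::PDF's new, numPages, extractPages and cleanoutput methods behave as documented; the helper names range_to_keep and truncate_with_cam_pdf and the file names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: page range string to keep (e.g. "1-100"),
# or undef when the file already has $max pages or fewer.
sub range_to_keep {
    my ($total, $max) = @_;
    return if $total <= $max;
    return "1-$max";
}

# Hypothetical helper: keep only the first $max pages of $in and
# write the result to $out. Returns 1 if a new file was written.
sub truncate_with_cam_pdf {
    my ($in, $out, $max) = @_;
    require CAM::PDF;    # loaded lazily; only needed for this fallback

    my $doc = CAM::PDF->new($in) or die "Cannot open $in\n";
    my $range = range_to_keep($doc->numPages(), $max);
    return 0 unless defined $range;

    $doc->extractPages($range);    # removes every page outside the range
    $doc->cleanoutput($out);
    return 1;
}

# Usage (file names are assumptions):
# truncate_with_cam_pdf('test.pdf', 'test_first100.pdf', 100);
```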

      Hello lennelei,

      How many pages of the old file do you want to keep? Hold on, are you trying to split the old file into sets of new pdfs of 10 pages each? If so, try something like this.

      #!/usr/bin/perl
      use strict;
      use warnings;
      use PDF::API2;
      use Data::Dumper;

      my $file   = 'test.pdf';
      my $oldpdf = PDF::API2->open($file);

      # Last page of each 10-page set: 10, 20, ..., 100
      my @steps = map { 10 * $_ } 1 .. 10;
      # print Dumper \@steps;

      if ($oldpdf->pages() > 100) {
          my $num        = 0;
          my $first_page = 1;
          for my $step (@steps) {
              my $newpdf = PDF::API2->new();
              for my $page_nb ($first_page .. $step) {
                  # Target page 0 appends the page at the end of the new pdf
                  $newpdf->import_page($oldpdf, $page_nb, 0);
              }
              $num++;
              $newpdf->saveas("pdf/${num}_$file");
              # Start the next set after this one so the sets do not overlap
              $first_page = $step + 1;
          }
      }

      __END__

      $ ll pdf/
      total 5736
      drwxrwxr-x 2 tinyos tinyos    4096 Jul 21 12:49 ./
      drwxrwxr-x 8 tinyos tinyos    4096 Jul 21 12:48 ../
      -rw-rw-r-- 1 tinyos tinyos  980911 Jul 21 12:49 10_test.pdf
      -rw-rw-r-- 1 tinyos tinyos  274610 Jul 21 12:49 1_test.pdf
      -rw-rw-r-- 1 tinyos tinyos  508740 Jul 21 12:49 2_test.pdf
      -rw-rw-r-- 1 tinyos tinyos  340428 Jul 21 12:49 3_test.pdf
      -rw-rw-r-- 1 tinyos tinyos  355785 Jul 21 12:49 4_test.pdf
      -rw-rw-r-- 1 tinyos tinyos  216205 Jul 21 12:49 5_test.pdf
      -rw-rw-r-- 1 tinyos tinyos  505735 Jul 21 12:49 6_test.pdf
      -rw-rw-r-- 1 tinyos tinyos  248888 Jul 21 12:49 7_test.pdf
      -rw-rw-r-- 1 tinyos tinyos 1027594 Jul 21 12:49 8_test.pdf
      -rw-rw-r-- 1 tinyos tinyos 1387582 Jul 21 12:49 9_test.pdf

      Unless I misunderstood. :) Give us a step-by-step description of what you are trying to achieve. You have a file with 100 pages and you want to create a new pdf containing how many pages of the original file?
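
      If the goal really is non-overlapping sets of 10 pages, the first and last page of each set can be computed up front and checked without touching any pdf. A minimal sketch; the helper name chunk_ranges is hypothetical, and unlike the script above it assumes the whole document is split, including a shorter final set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return a list of [first, last] page pairs that
# split pages 1..$total into consecutive sets of at most $size pages.
sub chunk_ranges {
    my ($total, $size) = @_;
    my @ranges;
    for (my $first = 1; $first <= $total; $first += $size) {
        my $last = $first + $size - 1;
        $last = $total if $last > $total;    # final set may be shorter
        push @ranges, [$first, $last];
    }
    return @ranges;
}

# A 25-page document split into sets of 10 prints:
# 1-10
# 11-20
# 21-25
printf "%d-%d\n", @$_ for chunk_ranges(25, 10);
```

      Each pair can then drive the inner loop that imports pages $first .. $last into a fresh pdf.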

      Seeking for Perl wisdom...on the process of learning...not there...yet!