holli has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monasterians, I am about to go crazy. The task is to merge multiple PDF files into one. Seems easy enough (hey, there is CPAN), so I wrote the following code:
use strict;
use File::Basename;    # provides fileparse()
use PDF::API2;

my $bookmark = 1;      # assumption: this flag is set elsewhere in the full script
my $pdf   = PDF::API2->new( -file => "path/to/outfile.pdf" );
my @files = glob("path/to/somewhere/else/*.pdf");
merge(@files);

# some stuff

sub merge {
    my ( $file, $root );
    $root = $pdf->outlines;
    my $import_page   = 0;
    my $document_page = 0;

    foreach $file (@_) {
        my ( $inputpdf, $inputdir ) = fileparse($file);
        my $input = PDF::API2->open($file);
        my @pages = 1 .. $input->pages;
        if ( scalar @pages > 0 ) {
            my $outline;
            $outline = $root->outline if $bookmark;
            foreach (@pages) {
                ++$import_page;
                ++$document_page;
                my $page = $pdf->importpage( $input, $_, $import_page );
                if ($bookmark) {
                    # bookmark title: file name up to the first dot
                    my ($bmtext) = ( $inputpdf =~ /([^\.]+)/ );
                    $outline->title($bmtext);
                    my $bm = $outline->outline;
                    $bm->title("Seite $document_page");    # "Seite" is German for "page"
                    $bm->dest($page);
                    $outline->dest($page) if $document_page == 1;
                    $outline->closed;
                }
            }
        }
    }
    $pdf->preferences( -outlines => 1 ) if $bookmark;
    $pdf->update;
    $pdf->end;
}
This works for the first round of merging. For example, I first merge x/1.pdf and x/2.pdf into ./x.pdf, then y/1.pdf and y/2.pdf into ./y.pdf. Both resulting files look fine in Acrobat Reader and Ghostview. I can print, copy and paste, etc.

The problem arises when I try to merge x.pdf and y.pdf into z.pdf. This produces a new file with the expected number of pages, but all pages are blank and I get two error messages (from Acrobat Reader):
1) A problem occured while reading the document (14)
2) A font is not listed in the resource-dictionary. Using Helvetica

Sometimes it also complains "Graphical Resource AR+B9 not found."

Opening the file with Ghostview, I get this error:
GSview 4.6 2004-01-11
AFPL Ghostscript 8.14 (2004-02-20)
Copyright (C) 2004 artofcode LLC, Benicia, CA. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Scanning PDF file
%GSVIEW_PDF_PAGES: 1 19
Displaying PDF page 1
%GSVIEW_PDF_PAGE: 1
%GSVIEW_PDF_MEDIA: [0 0 594.9 841.36]
%GSVIEW_PDF_ROTATE: 0
Error: /invalidfont in /AB+F0
Operand stack:
   --dict:4/4(L)--   AB+F0   7.92
Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   false   1   %stopped_push   1   3   %oparray_pop   1   3   %oparray_pop   1   3   %oparray_pop   1   3   %oparray_pop   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   --nostringval--   --nostringval--   %array_continue   --nostringval--   false   1   %stopped_push   --nostringval--   %loop_continue   --nostringval--
Dictionary stack:
   --dict:1120/1686(ro)(G)--   --dict:0/20(G)--   --dict:78/200(L)--   --dict:104/127(ro)(G)--   --dict:238/347(ro)(G)--   --dict:20/24(L)--   --dict:4/6(L)--   --dict:20/20(L)--   --dict:1/1(ro)(G)--   --dict:1/1(ro)(G)--   --dict:1/1(ro)(G)--   --dict:9/13(L)--
Current allocation mode is local
Last OS error: No such file or directory
pdf_page failed
Can someone help me solve this issue? I have no clue where to start.

Update:
Corrected copy&paste errors
Update:
Followed dragonchild's criticism

holli, regexed monk

Re: repeated merging of PDF-Files
by dragonchild (Archbishop) on Feb 03, 2005 at 13:29 UTC
    • Where is $pdf created? You're not showing all the relevant code ...
    • Why are you using global variables? Why aren't you using strict?
    • What happens when you try and merge three files at one time, as your code seems to be able to do?

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

      1) $pdf is a PDF::API2 object that is created outside the subroutine as a package-global.
      2) I don't consider globals to be bad in short scripts. I am using strict, but I posted only the relevant subroutine.
      3) The same thing happens. The error does not depend on the number of documents.
      4) I will update my code section.

      holli, regexed monk
        Ok ... This sounds like a bug in PDF::API2. I would build a test case with 4 very simple PDF files and open a bug at http://rt.cpan.org. It's a very interesting bug, I'll say that ...
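
A test case along those lines might look like the sketch below. It generates four one-page PDFs, merges them pairwise, and then merges the two results, which is the step reported as failing. The file names (t1.pdf through t4.pdf, x.pdf, y.pdf, z.pdf) and the helper names make_simple_pdf and merge_pdfs are made up for illustration, and whether the script actually reproduces the blank pages will depend on the PDF::API2 version and the viewer used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use PDF::API2;

# Create a one-page PDF containing a single line of Helvetica text.
# (Hypothetical helper, not part of PDF::API2.)
sub make_simple_pdf {
    my ( $file, $label ) = @_;
    my $pdf  = PDF::API2->new;
    my $page = $pdf->page;
    $page->mediabox( 595, 842 );    # A4 in points
    my $text = $page->text;
    $text->font( $pdf->corefont('Helvetica'), 24 );
    $text->translate( 72, 720 );
    $text->text($label);
    $pdf->saveas($file);
}

# Append every page of each input file to a fresh output document.
# (Hypothetical helper; a stripped-down version of the merge sub, no bookmarks.)
sub merge_pdfs {
    my ( $outfile, @infiles ) = @_;
    my $out = PDF::API2->new;
    my @sources;                    # keep the sources alive until the output is saved
    for my $file (@infiles) {
        my $in = PDF::API2->open($file);
        push @sources, $in;
        # With no target page number, importpage appends at the end.
        $out->importpage( $in, $_ ) for 1 .. $in->pages;
    }
    $out->saveas($outfile);
}

make_simple_pdf( "t$_.pdf", "Test page $_" ) for 1 .. 4;

merge_pdfs( 'x.pdf', 't1.pdf', 't2.pdf' );    # first generation
merge_pdfs( 'y.pdf', 't3.pdf', 't4.pdf' );
merge_pdfs( 'z.pdf', 'x.pdf',  'y.pdf' );     # second generation: the reported failure

my $page_count = PDF::API2->open('z.pdf')->pages;
print "z.pdf has $page_count page(s); open it in a viewer to check for blank pages\n";
```

If z.pdf opens cleanly with these trivial inputs, the next step would be to swap in the real source files one at a time to find which of them triggers the problem.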


Re: repeated merging of PDF-Files
by knowmad (Monk) on Feb 03, 2005 at 14:02 UTC

    Hi holli,

    I can't help you solve the issue with PDF::API2. I'm using PDF::Reuse, which does not rely on that package. You may want to work it into your merge subroutine to see if you still get the same errors.

    Right now I'm having a problem with merging some data. I've been in touch with the author and it should hopefully be resolved soon.

    Good luck,
    William

      I have already tried that module and ran into another bug. Some documents created via PDFWriter are cut off by about a centimeter at the top.

      But the code is much shorter ,)
      use PDF::Reuse;

      # @pdf holds the names of the input files
      prFile("out.pdf");
      for (@pdf) {
          prDoc($_);
      }
      prEnd();

      holli, regexed monk

        I had a somewhat similar problem because the incoming PDF was 8.5x11. I used prMbox to set the page margins. The default seems to be US Letter, which is probably incorrect for your PDFs (which I'm guessing are A4, since you're using metric measurements).

        William

Re: repeated merging of PDF-Files
by AztecMonkey (Initiate) on Feb 03, 2005 at 16:10 UTC
    Have you tried viewing this in different versions of Acrobat?

    I am having a similar problem with importing an existing PDF and inserting text on a portion of a specific page. It displays just fine in Acrobat 6 and earlier, but the imported pages are blank in Acrobat 7, although the fonts from the imported document are embedded in the PDF created by PDF::API2. I am assuming that the problem is in the import function of PDF::API2, because Acrobat 7 opens the template PDF just fine. I will post anything else I find.
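
For anyone who wants to try this against their own viewer, the import-and-overlay workflow described above can be sketched roughly as follows. This is not AztecMonkey's actual code: the file names template.pdf and stamped.pdf, the fonts, and the coordinates are placeholders, and the script builds its own one-page template so that it is self-contained.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use PDF::API2;

# Build a one-page stand-in for the existing document (hypothetical file name).
my $template_file = 'template.pdf';
{
    my $tpl  = PDF::API2->new;
    my $page = $tpl->page;
    $page->mediabox( 595, 842 );    # A4 in points
    my $text = $page->text;
    $text->font( $tpl->corefont('Times-Roman'), 18 );
    $text->translate( 72, 760 );
    $text->text('Original template content');
    $tpl->saveas($template_file);
}

# Import page 1 of the template into a new document and draw on top of it.
my $source = PDF::API2->open($template_file);
my $out    = PDF::API2->new;
my $page   = $out->importpage( $source, 1 );

my $overlay = $page->text;          # new text content on the imported page
$overlay->font( $out->corefont('Helvetica'), 12 );
$overlay->translate( 72, 700 );     # position in points from the lower-left corner
$overlay->text('Text inserted after import');

$out->saveas('stamped.pdf');
```

Comparing stamped.pdf in several Acrobat versions should show whether the imported content or only the overlaid text survives.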

Re: repeated merging of PDF-Files
by blazar (Canon) on Feb 03, 2005 at 13:32 UTC
    Fellow monasterians, i am about to go crazy. The task is to merge multiple PDF-Files to one. Seems easy enough (hey, there is CPAN) and so i wrote the following algorithm:
    Totally OT, but you may want to search CTAN instead. The pdfpages package comes to mind, for example. I haven't the slightest idea how it's implemented; I have only heard it mentioned, though quite frequently and with good comments!
Re: repeated merging of PDF-Files
by AztecMonkey (Initiate) on Feb 16, 2005 at 14:51 UTC
    Don't know yet if this works, but the latest build (PDF-API2-0.40.91) purports to solve this problem. Will test and post results.