SmugX has asked for the wisdom of the Perl Monks concerning the following question:

Hello all; I'm after some pointers:

I have a requirement to produce mail merges with data in a "hierarchical" format - e.g. printing letters to multiple customers who have ordered multiple photographs, and perhaps have ordered multiple albums that contain multiple photographs ... the result may be something like this:

    Dear Bob A,

    Here are details on the photos you've ordered:

    Picture 1 (8"x6")
    Picture 2 (8"x6")
    Picture 3 (10"x8")

    Prints in a 10"x8" Red Album
    -------------------------------
    Picture 4, Picture 5, Picture 6

    [repeat etc.]

    Yours,
    Bob B

My first thought was to put the data into XML, and use Template Toolkit to render the result. The problem is that the result will be paper-based, so HTML isn't an ideal output format. I see that I could go through TT, via LaTeX to PDF or PostScript, but ideally I wanted to be able to hand over the letter for less techy people to amend, and LaTeX didn't seem ideal for that. (The users are used to software like MS Word and Adobe Acrobat ...)
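
For anyone following along, here is a rough sketch of the Template Toolkit idea with nested data. The data structure, its key names (photos, albums, dimensions and so on) and the render_letter helper are made up for illustration; in practice the hash would be built from the XML rather than written inline, and the output is plain text, so it does not address the paper-format problem.

```perl
use strict;
use warnings;
use Template;

# Hypothetical customer record; the key names are assumptions, not a fixed schema.
my $customer = {
    name   => 'Bob A',
    sender => 'Bob B',
    photos => [
        { title => 'Picture 1', dimensions => '8"x6"' },
        { title => 'Picture 2', dimensions => '8"x6"' },
        { title => 'Picture 3', dimensions => '10"x8"' },
    ],
    albums => [
        {
            dimensions => '10"x8"',
            colour     => 'Red',
            pictures   => [ 'Picture 4', 'Picture 5', 'Picture 6' ],
        },
    ],
};

# The letter itself: nested FOREACH loops walk the hierarchy.
my $letter = <<'END_TT';
Dear [% customer.name %],

Here are details on the photos you've ordered:

[% FOREACH photo IN customer.photos -%]
[% photo.title %] ([% photo.dimensions %])
[% END -%]
[% FOREACH album IN customer.albums %]
Prints in a [% album.dimensions %] [% album.colour %] Album
-------------------------------
[% album.pictures.join(', ') %]
[% END %]
Yours,
[% customer.sender %]
END_TT

# Render one letter from a template string and a customer hash.
sub render_letter {
    my ($template, $data) = @_;
    my $tt = Template->new() or die Template->error();
    my $output = '';
    $tt->process(\$template, { customer => $data }, \$output)
        or die $tt->error();
    return $output;
}

print render_letter($letter, $customer);
```

Calling render_letter once per customer record gives one letter per customer; the template text is the part a non-programmer would edit.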

So I then wondered if I could get a PDF template knocked up, perhaps embedding Template Toolkit code on the page, and then use PDF::API2 to extract the text, parse it with TT, replace the text and render a new PDF. However, extracting the text that exists in a PDF seems to be difficult - I can't find any example code anywhere.

So this is where I've got. Does anyone have any pointers to finish off any of my lines of enquiry, or alternatively any completely different solutions that may meet my needs? (I have no desperate desire to write this myself if someone's already done it for me! :))

Many thanks for any advice, Neil.

Replies are listed 'Best First'.
Re: Pointers on an XML mail merge solution?
by idsfa (Vicar) on May 12, 2004 at 15:00 UTC

    From the docs:

    use PDF::API2;
    $pdf = PDF::API2->open('existing.pdf');
    $string = $pdf->stringify;

    Update: Original code was bad (it truncated the file). Removed rather than <strike>'d because no one had followed up yet. For posterity, I had stupidly used the new method instead of open.


    If anyone needs me I'll be in the Angry Dome.
      Thanks for this - but doesn't $pdf->stringify just return the raw PDF data, so I've still got to parse it? Apologies if I'm being dim and missing the point here.

      In other words, doesn't this code have the same effect? :

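      my $string;
      open(my $pdf_fh, '<', 'existing.pdf') or die "Can't open existing.pdf: $!";
      binmode $pdf_fh;    # PDF is binary data, so no newline translation
      while (<$pdf_fh>) { $string .= $_; }
      close($pdf_fh);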

        You are quite correct. I'm going to go get coffee, as I am obviously still asleep. My googling finds this as an actually intelligent discussion. Sadly, no one appears to have picked this task up since '99.

        OTOH, PDF::Reuse seems to address exactly the issue of building up PDFs from a template, so you might look there. OTOOH, what's wrong with using plain text the whole way through?


Re: Pointers on an XML mail merge solution?
by benrwebb (Scribe) on May 12, 2004 at 15:43 UTC
    Have you considered rendering an .rtf document? You said that you wanted some of the less technical users to be able to amend the letter/report before it was sent, so this seems like the way to go. They could then just bring it up in Word like any other document.

    If you check out CPAN, you'll find a number of modules to create them.
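
As a minimal sketch of the RTF route, assuming RTF::Writer as the CPAN module (the reply above doesn't name one, so that choice is mine) and a made-up customer hash and letter_to_rtf helper:

```perl
use strict;
use warnings;
use RTF::Writer;

# Build one letter as an RTF string. The hash layout is hypothetical.
sub letter_to_rtf {
    my ($customer) = @_;
    my $buffer = '';
    my $rtf = RTF::Writer->new_to_string(\$buffer);

    $rtf->prolog( title => "Order details for $customer->{name}" );
    $rtf->paragraph("Dear $customer->{name},");
    $rtf->paragraph("Here are details on the photos you've ordered:");

    # One paragraph per loose print.
    $rtf->paragraph("$_->{title} ($_->{dimensions})")
        for @{ $customer->{photos} };

    # One bold heading plus a picture list per album.
    for my $album ( @{ $customer->{albums} } ) {
        # A scalar ref is passed through as raw RTF; \b switches on bold.
        $rtf->paragraph( \'\b',
            "Prints in a $album->{dimensions} $album->{colour} Album" );
        $rtf->paragraph( join ', ', @{ $album->{pictures} } );
    }

    $rtf->paragraph("Yours,");
    $rtf->paragraph( $customer->{sender} );
    $rtf->close;
    return $buffer;
}

my $rtf_text = letter_to_rtf({
    name   => 'Bob A',
    sender => 'Bob B',
    photos => [ { title => 'Picture 1', dimensions => '8"x6"' } ],
    albums => [
        {
            dimensions => '10"x8"',
            colour     => 'Red',
            pictures   => [ 'Picture 4', 'Picture 5', 'Picture 6' ],
        },
    ],
});

# Save it where Word can open it.
open( my $out_fh, '>', 'letter.rtf' ) or die "Can't write letter.rtf: $!";
print {$out_fh} $rtf_text;
close($out_fh);
```

This generates the letter from code, so users would amend the finished .rtf in Word rather than the template itself.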
      Ooh! I hadn't thought of RTF. Thanks! (Scurries off to CPAN ...)