rpike has asked for the wisdom of the Perl Monks concerning the following question:

Has anyone in here had any experience with inserting JPEGs into a PDF from "scratch" Perl code? I have a converter that will take a text file and convert it to a PDF, but I'm trying to insert an image as a background. Does anyone know the code needed to insert the image, AND does anyone know what content of the image needs to be inserted into the data stream within the PDF? Thanks in advance for any helpful information. Sincerely, RP

Re: PDF and Image Insertion
by traveler (Parson) on Nov 27, 2006 at 18:23 UTC
    This sample should do what you want.
    use strict;
    use warnings;
    use PDF::API2;

    my $new  = PDF::API2->new;                     # new doc
    my $pdf  = PDF::API2->open("testdata.pdf");    # existing foreground doc
    my $page = $new->page;
    my $bg   = $page->gfx;
    my $img  = $new->image_jpeg("test.jpg");       # background image
    $bg->image($img, 0, 0, 612, 792);              # anchor at lower left, fill 8.5x11 paper (612x792 pt)
    my $form = $new->importPageIntoForm($pdf, 1);  # get page 1 from existing doc
    $bg->formimage($form, 0, 0, 1);                # draw the old page on top of the image
    $new->saveas("try2.pdf");
Re: PDF and Image Insertion
by bart (Canon) on Nov 27, 2006 at 18:30 UTC
    There's a module on CPAN that I rather like, but that has been abandoned: PDF::Create. The last version on CPAN is 0.01, but I found it is on Sourceforge too, and the version there, 0.06.1b, supports insertion of images.
Re: PDF and Image Insertion
by robot_tourist (Hermit) on Nov 27, 2006 at 15:53 UTC

    I must say I know next to nothing about creating PDFs, but have you looked up the PDF modules on CPAN or the actual Adobe PDF documentation? (apparently it's an open protocol)

    How can you feel when you're made of steel? I am made of steel. I am the Robot Tourist.
    Robot Tourist, by Ten Benson

      Some of the people I've talked to, that seem to know CPAN pretty well, have given me the impression it isn't flexible enough to do what I'm trying to get it to do. Also, I've been hunting around trying to find out exactly what should be extracted from the JPEG to use in the PDF and can't find too much there either. I'm working with some information I've found from a small doc on the web that is helping slightly in the extraction of data from the JPEG. Hopefully that will lead to more info and getting the darn thing up and running. I find Adobe's documentation boring and clogged down with crap but I have referenced it a bunch of times.
        Also, I've been hunting around trying to find out exactly what should be extracted from the JPEG to use in the PDF and can't find too much there either.

        Yeah, you're right, Adobe's Reference Documentation isn't exactly clear on what a stream appropriate for the DCTDecode filter (see p. 60) would look like. However, trial-and-error shows that JFIF (JPEG File Interchange Format -- that's what regular jpeg files are) is fine, apparently. IOW, you don't need to extract anything from the jpeg file, just copy the file as is (including the header stuff) into the stream section of your PDF object declaration...

        For example, an image object declaration could look like

        1 0 obj
        << /Type /XObject
           /Subtype /Image
           /Width $WIDTH
           /Height $HEIGHT
           /ColorSpace /DeviceRGB
           /BitsPerComponent 8
           /Length $STREAMSIZE_IN_BYTES
           /Filter /DCTDecode
        >>
        stream
        ... entire jpeg file contents here ...
        endstream
        endobj

        (Replace $WIDTH, $HEIGHT, $STREAMSIZE_IN_BYTES with the appropriate values, of course. The ColorSpace and BitsPerComponent settings (as shown) should be fine for most typical color jpeg files; a grayscale jpeg would need /DeviceGray instead of /DeviceRGB.)
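For the from-scratch route, here is a minimal sketch of how those placeholders might be filled in from the jpeg data itself. The function names jpeg_info and image_xobject are made up for this illustration (they come from no module), and the code assumes an ordinary JFIF file in which only length-prefixed segments precede the start-of-frame marker. CMYK jpegs may need extra handling that is not covered here.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the JPEG segments until a start-of-frame (SOFn)
# marker turns up, then return (width, height, components, bits per sample).
sub jpeg_info {
    my ($data) = @_;
    die "not a JPEG\n" unless substr($data, 0, 2) eq "\xFF\xD8";
    my $pos = 2;
    while ($pos + 4 <= length $data) {
        my ($ff, $marker, $len) = unpack 'C C n', substr($data, $pos, 4);
        die "bad marker at offset $pos\n" unless $ff == 0xFF;
        if ($marker == 0xFF) { $pos++; next }    # skip fill bytes
        # SOF0..SOF15, except DHT (C4), JPG (C8) and DAC (CC)
        if ($marker >= 0xC0 && $marker <= 0xCF
            && $marker != 0xC4 && $marker != 0xC8 && $marker != 0xCC) {
            die "truncated SOF segment\n" if $pos + 10 > length $data;
            my ($bits, $h, $w, $comps) = unpack 'C n n C', substr($data, $pos + 4, 6);
            return ($w, $h, $comps, $bits);
        }
        $pos += 2 + $len;    # 2 marker bytes + segment (length counts itself)
    }
    die "no SOF marker found\n";
}

# Hypothetical helper: wrap the unmodified JPEG bytes in an image XObject.
sub image_xobject {
    my ($objnum, $data) = @_;
    my ($w, $h, $comps, $bits) = jpeg_info($data);
    my %space = (1 => 'DeviceGray', 3 => 'DeviceRGB', 4 => 'DeviceCMYK');
    my $cs = $space{$comps} or die "unexpected component count $comps\n";
    return "$objnum 0 obj\n"
         . "<< /Type /XObject /Subtype /Image /Width $w /Height $h\n"
         . "   /ColorSpace /$cs /BitsPerComponent $bits\n"
         . "   /Length " . length($data) . " /Filter /DCTDecode >>\n"
         . "stream\n" . $data . "\nendstream\nendobj\n";
}
```

To use it, slurp the file in binary mode (open with '<:raw', read with local $/ undefined) and pass the bytes to image_xobject. The output filehandle needs binmode as well, otherwise /Length will not match what ends up on disk.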

        Well, I guess, I'll leave it at that for the moment, because I'm not sure at all if that's what you were asking... ;) -- In any case, if you want me to elaborate on this rather low level approach, just say so... (also, I could put up a minimal working example somewhere, containing nothing but the above object plus the absolutely essential boiler plate -- so you can more easily examine the details in context).

        Having said that, I'd like to point out that - as suggested by other monks - a more high-level approach, using CPAN modules, is almost certainly the way to go. Except if you really want to do it yourself from scratch, for the learning experience or whatever.

        Actually, it might make some sense to modify your existing converter tool so that it outputs the additional PDF code directly while creating the PDF in the first place... Manually inserting an image into an existing PDF file is quite a PITA -- mainly because every object in a PDF file is located via the cross-reference (xref) table, which records each object's byte offset in the file. As soon as you begin to shift objects around (e.g. by inserting a new one), you have to adjust all the offsets that follow, plus the startxref pointer at the end of the file...
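If the converter writes the PDF itself, the offset bookkeeping can be done while the objects are emitted. The following is only a rough sketch of that idea: build_pdf is a made-up name, and it assumes object 1 is the catalog, all generation numbers are 0, and all strings are plain byte strings (so length() counts bytes).

```perl
use strict;
use warnings;

# Hypothetical helper: assemble a PDF from a list of object bodies (object 1
# first), recording each object's byte offset for the xref table as we go.
sub build_pdf {
    my (@bodies) = @_;
    my $pdf = "%PDF-1.4\n";
    my @offsets;
    my $num = 0;
    for my $body (@bodies) {
        $num++;
        push @offsets, length $pdf;          # where "N 0 obj" starts
        $pdf .= "$num 0 obj\n$body\nendobj\n";
    }
    my $xref_pos = length $pdf;              # where the xref table starts
    my $count    = @bodies + 1;              # entry 0 is the free-list head
    $pdf .= "xref\n0 $count\n";
    $pdf .= "0000000000 65535 f \n";         # each entry is exactly 20 bytes
    $pdf .= sprintf("%010d 00000 n \n", $_) for @offsets;
    $pdf .= "trailer\n<< /Size $count /Root 1 0 R >>\n"
          . "startxref\n$xref_pos\n%%EOF\n";
    return $pdf;
}

# Example: a one-page document with a line of text.
my $content = "BT /F1 24 Tf 72 720 Td (Hello) Tj ET";
my $demo = build_pdf(
    "<< /Type /Catalog /Pages 2 0 R >>",
    "<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] /Contents 4 0 R\n"
      . "   /Resources << /Font << /F1 5 0 R >> >> >>",
    "<< /Length " . length($content) . " >>\nstream\n$content\nendstream",
    "<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
);
```

Adding an object then just means adding one more body to the list; the offsets take care of themselves. For a full-page background image, the image object would be listed in the page's /Resources under /XObject (say as /Im1), and the content stream would start with "q 612 0 0 792 0 0 cm /Im1 Do Q" before the text is drawn.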