Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I wish to create a CGI script that will place the values of CGI variables into a premade form (e.g. IRS forms, INS forms, and the like) and save it as a PDF file. What would be the most straightforward way of doing this? Assume a Linux box with a fair amount of usable tools.

Replies are listed 'Best First'.
Re: Dynamic Forms
by toma (Vicar) on May 25, 2001 at 07:50 UTC
    I do this sort of thing with the indispensable ImageMagick tools, which work very nicely on Linux. This method is not clever, but it is straightforward. Start with a Postscript file if one is available. IRS forms are available in Postscript. If you don't have a Postscript file, you can convert a PDF file to Postscript with ImageMagick, which in turn calls Ghostscript but hides the gory details from you:
    convert f1040ez.pdf f1040ez.ps
    Now you have a Postscript text file you can manipulate, so that as a Perl programmer you become invincible :-). At this point you have at least two ways to add your annotations:
    Use Raw Postscript
    It is not hard to add simple text boxes to a Postscript document.
    Use Xfig
    Convert the PDF or Postscript file to fig using ImageMagick. Fig is a vector format used by the drawing program xfig. With xfig, add text boxes where you want to put information into your form.
    Inside the text boxes that you add to the form (using either technique), place unique strings of characters, such as "XYZZY1". Write a simple Perl program that substitutes your CGI form values for the unique strings.
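
    The substitution step might be sketched as below. This is only an illustration, not toma's actual program: the sub names are made up, and it assumes each placeholder sits inside a Postscript string such as (XYZZY1), so the replacement value has its parentheses and backslashes escaped. In a real CGI the hash would be filled from the submitted form parameters.

```perl
use strict;
use warnings;

# Escape the characters that are special inside a Postscript (string).
sub ps_escape {
    my ($s) = @_;
    $s =~ s/([\\()])/\\$1/g;
    return $s;
}

# Replace each placeholder token (e.g. XYZZY1) in $text with its form value.
# $values is a hash reference: placeholder => replacement.
sub fill_placeholders {
    my ($text, $values) = @_;

    # Try longer tokens first so XYZZY10 is not mistaken for XYZZY1.
    my $alternation = join '|',
        map { quotemeta } sort { length($b) <=> length($a) } keys %$values;
    return $text if $alternation eq '';

    $text =~ s/($alternation)/ps_escape($values->{$1})/ge;
    return $text;
}

# Hypothetical sample data standing in for a Postscript file and CGI input.
my $ps = "(XYZZY1) show (XYZZY2) show";
print fill_placeholders($ps, { XYZZY1 => 'Smith', XYZZY2 => 'Pat (Jr.)' }), "\n";
```

    Doing all the replacements in one pass means a form value that happens to contain another placeholder string is not substituted a second time.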

    Use ImageMagick again to convert the Postscript back to PDF. You can also make the substitutions directly in the PDF file, but a PDF records byte counts about itself (stream lengths and cross-reference offsets), and changing the length of the text may throw them off, which will cause Acrobat to display a warning that the PDF file is corrupt (Acrobat will probably work fine, but the warning is annoying).

    It is easiest to use a fixed-width font (such as Courier) in your text boxes. This makes width checking for your form information simple, so you don't overflow.
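
    A width check along these lines could look like the following sketch. It relies on the fact that every Courier glyph advances 600 units on a 1000-unit em, i.e. 0.6 times the font size. The sub names and the box sizes are made up for illustration.

```perl
use strict;
use warnings;

# Courier advance width in thousandths of the font size.
my $COURIER_ADVANCE = 600;

# How many Courier characters fit in a box of the given width?
# Both arguments are in Postscript points (1/72 inch).
sub max_chars {
    my ($box_width_pt, $font_size_pt) = @_;
    return int( ($box_width_pt * 1000) / ($COURIER_ADVANCE * $font_size_pt) );
}

# True if $value fits in the box without overflowing.
sub fits_box {
    my ($value, $box_width_pt, $font_size_pt) = @_;
    return length($value) <= max_chars($box_width_pt, $font_size_pt);
}

# Cut $value down to what the box can hold.
sub clip_to_box {
    my ($value, $box_width_pt, $font_size_pt) = @_;
    return substr($value, 0, max_chars($box_width_pt, $font_size_pt));
}

# A 2-inch (144 pt) box at 10 pt Courier: each character is 6 pt wide.
print max_chars(144, 10), "\n";
```

    Whether to reject an over-long value or silently clip it is a design choice; for a tax form, rejecting it and asking the user again is probably safer.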

    The xfig approach probably won't do exactly what you want. You will lose too much information in the round-trip through the translator. So combine this approach with a small amount of Postscript knowledge to translate the text boxes that xfig creates into the raw Postscript that you will add to the original Postscript document.
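
    As a rough idea of what such raw Postscript looks like, the sketch below generates a one-line text box with the standard operators findfont, scalefont, setfont, moveto and show. The sub name and coordinates are made up; the coordinates are in points from the lower-left corner of the page, which is what you would read off the xfig layout. The snippet has to end up in the page description before that page's showpage, and where exactly that is depends on how the original file is structured.

```perl
use strict;
use warnings;

# Return a Postscript fragment that draws $text at ($x, $y) in Courier.
# $size is the font size in points and defaults to 10.
sub ps_text_box {
    my ($x, $y, $text, $size) = @_;
    $size = 10 unless defined $size;

    # Backslashes and parentheses must be escaped inside a (string).
    (my $escaped = $text) =~ s/([\\()])/\\$1/g;

    # gsave/grestore keeps the font and position changes from leaking
    # into the rest of the page.
    return "gsave\n"
         . sprintf("/Courier findfont %g scalefont setfont\n", $size)
         . sprintf("%g %g moveto\n", $x, $y)
         . "($escaped) show\n"
         . "grestore\n";
}

# A placeholder box one inch from the left edge, near the top of a page.
print ps_text_box(72, 700, 'XYZZY1');
```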

    You could also use other tools, such as those from he-whose-name-must-not-be-spoken, to add the text boxes.

    Both Postscript and PDF are fully documented by Adobe.

    The only part you may have trouble with is assembling all the necessary pieces of ImageMagick, but some Linux distributions do a reasonable job of this.

    It should work perfectly the first time! - toma

Re: Dynamic Forms
by blue_cowdawg (Monsignor) on May 25, 2001 at 00:37 UTC

    As far as a library for creating PDF files goes you might want to look around CPAN at the PDF modules there.

    As far as premade forms go, you can use various template modules also available at CPAN. I'm not sure about typesetting actual IRS and INS forms from scratch. That might prove to be daunting.


    Peter L. Berghold, Schooner Technology Consulting, Inc.
    Peter@Berghold.Net, www.berghold.net
Re: Dynamic Forms
by HamNRye (Monk) on May 25, 2001 at 02:52 UTC
    Adobe offers the FDF toolkit and SDK from their website. We use it here for processing on-line forms, and it works great. But, you can also set text and variable boxes inside your PDF, and then there is a rather simple OO approach to getting them in the form. Again, this is rather simple with the FDF toolkit. That's my 0.02
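
    The toolkit's own API is not shown here. As a toolkit-free illustration of the idea, the sketch below writes a minimal FDF file by hand: a list of field name (/T) and value (/V) pairs that Acrobat can merge into a PDF whose form fields carry the same names. The sub names and field names are hypothetical, and a real form may need more than this minimal structure.

```perl
use strict;
use warnings;

# Escape backslashes and parentheses for an FDF literal string.
sub fdf_escape {
    my ($s) = @_;
    $s =~ s/([\\()])/\\$1/g;
    return $s;
}

# Build a minimal FDF document from a hash reference: field name => value.
sub build_fdf {
    my ($fields) = @_;

    my $entries = '';
    for my $name (sort keys %$fields) {
        $entries .= sprintf "<< /T (%s) /V (%s) >>\n",
            fdf_escape($name), fdf_escape($fields->{$name});
    }

    return "%FDF-1.2\n"
         . "1 0 obj\n"
         . "<< /FDF << /Fields [\n"
         . $entries
         . "] >> >>\n"
         . "endobj\n"
         . "trailer\n"
         . "<< /Root 1 0 R >>\n"
         . "%%EOF\n";
}

# Hypothetical field names; they must match the names used in the PDF form.
print build_fdf({ last_name => 'Smith', first_name => 'Pat' });
```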
Re: Dynamic Forms
by aardvark (Pilgrim) on May 25, 2001 at 03:25 UTC
    I don't know if this is the same thing that HamNRye was talking about, but I've been looking at another Adobe service that converts your documents into PDF. They give you a few free trials but then I think they make you pay. Boo.

    If you are going to be doing any kind of document format conversion, I suggest that you first get your data into an XML format and then transform it using XSL and XSLT.

    There was just a good article on this on xml.com. Don't get put off by all the Java talk. You can use Xalan and FOP as command-line tools, and call them from your Perl script.

    Get Strong Together!!

Re: Dynamic Forms
by Dr. Mu (Hermit) on May 25, 2001 at 08:05 UTC
    I did some checking a while back on this very concept and discovered the following: Most of the text in an Acrobat-created PDF file is compressed. You can't read it in ASCII. However, form entries are readable. If you have a blank PDF form already, manually enter unique patterns into the blanks, e.g. "PDF_Form_LastName", and save the file. In your CGI program, slurp the file in as one big string, replace the patterns with the desired form contents using regexp substitutions, and save the new file or ship it off to the client. One thing not to do, though, is give your patterns variable names and expect interpolation to do anything but make a mess. There are probably plenty of other "variable names" hiding in there you don't want changed to nulls!

    One caveat: Beyond examining the PDF form layout, I haven't actually tried any of this. There may be some hidden gotchas I've overlooked. (Hidden checksums, perhaps?) But it's sure worth a try.
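
    A sketch of the slurp-and-substitute idea follows. It is as untested against real Acrobat files as the description above. It adds one assumption that is not in the original suggestion: each replacement is padded with spaces to the length of its pattern, so the file size and every byte offset stay the same, which should sidestep any length-related gotcha. The sub names are made up, and the patterns are matched as literal strings, never interpolated as Perl variables.

```perl
use strict;
use warnings;

# Read a whole file as raw bytes.
sub slurp {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";
    local $/;               # undefined record separator: read everything
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Escape $value for a PDF literal string and pad it with spaces to the
# exact length of $pattern. Dies if the escaped value is too long.
sub pad_to_pattern {
    my ($pattern, $value) = @_;
    (my $escaped = $value) =~ s/([\\()])/\\$1/g;
    my $room = length($pattern) - length($escaped);
    die "Value for $pattern does not fit in " . length($pattern) . " bytes\n"
        if $room < 0;
    return $escaped . (' ' x $room);
}

# Replace every pattern in $pdf with its same-length padded value.
sub fill_pdf_fields {
    my ($pdf, $values) = @_;
    my $alternation = join '|',
        map { quotemeta } sort { length($b) <=> length($a) } keys %$values;
    return $pdf if $alternation eq '';
    $pdf =~ s/($alternation)/pad_to_pattern($1, $values->{$1})/ge;
    return $pdf;
}

# In-memory stand-in for slurp('form.pdf'); the path would be hypothetical.
my $blank  = "<< /T (last) /V (PDF_Form_LastName) >>";
my $filled = fill_pdf_fields($blank, { PDF_Form_LastName => 'Smith' });
print length($blank) == length($filled) ? "same length\n" : "length changed\n";
```

    The cost of padding is that a value can never be longer than the pattern typed into the blank form, so the patterns should be as long as the widest value you intend to accept.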
Re: Dynamic Forms
by davis (Vicar) on May 25, 2001 at 12:47 UTC
    I've used PDFlib before, and I like it. You can build it as a Perl module without difficulty. It's under the Aladdin Free Public License.
    davis
Re: Dynamic Forms
by larryk (Friar) on May 25, 2001 at 00:56 UTC
    I haven't done any PHP work, but apparently it's very Perl-like. So if you have it installed on your webserver already, it might be worth taking a look at, as I believe it has native PDF support.
Re: Dynamic Forms
by shotgunefx (Parson) on May 25, 2001 at 22:59 UTC
    I used PDF::Create for a similar project. It was pretty straightforward.

    -Lee

    "To be civilized is to deny one's nature."