I do this sort of thing with the indispensable ImageMagick tools,
which work very nicely on Linux.
This method is not clever, but it is straightforward.
Start with a PostScript file if one is available.
IRS files are available in PostScript. If you don't have
a PostScript file, you can convert a PDF file
to PostScript with ImageMagick,
which in turn calls Ghostscript but hides the gory details from you:
    convert f1040ez.pdf f1040ez.ps
Now you have a PostScript file, which is plain text that you can manipulate,
so as a Perl programmer you become invincible :-).
At this point you have at least two ways to add your annotations:
- Use raw PostScript:
  It is not hard to add simple text boxes to a PostScript document.
- Use xfig:
  Convert the PDF or PostScript file to fig using ImageMagick.
  Fig is a vector format used by the drawing program xfig. With
  xfig, add text boxes where you want to put information into
  your form.
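For the raw PostScript route, here is a minimal sketch of what "adding a text box" can look like. The sub names (ps_text_box, add_before_showpage), the coordinates, and the tiny in-memory page are all made up for illustration. It assumes the page is drawn in the default coordinate system (points from the lower-left corner) and that the first line mentioning showpage really is the end of the page, which you should check in your own file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a PostScript fragment that draws $text at ($x, $y) in Courier
# at $size points. $text must already be safe inside a (...) string.
sub ps_text_box {
    my ($x, $y, $size, $text) = @_;
    return "gsave\n"
         . "/Courier findfont $size scalefont setfont\n"
         . "$x $y moveto\n"
         . "($text) show\n"
         . "grestore\n";
}

# Insert $fragment just before the first line that contains "showpage".
# Returns the new document text, or undef if no such line exists.
sub add_before_showpage {
    my ($ps, $fragment) = @_;
    if ($ps =~ s/^(.*\bshowpage\b)/$fragment$1/m) {
        return $ps;
    }
    return undef;
}

# Hypothetical one-page document, just to show the result.
my $page = "%!PS\nnewpath\nshowpage\n";
print add_before_showpage($page, ps_text_box(72, 700, 10, 'XYZZY1'));
```

In real use you would read the PostScript file into a string, add one fragment per field, and write the result back out.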
Inside the text boxes that you add to the form (using either technique),
place unique strings of characters, such as "XYZZY1". Write
a simple Perl program that substitutes your CGI form values for the
unique strings.
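The substitution step might look something like the sketch below. The sub names and the sample values are hypothetical; in a real CGI script the hash would be filled from the submitted form parameters. It assumes the placeholders sit inside PostScript (...) strings, so parentheses and backslashes in the values are escaped, and it replaces everything in one pass so that "XYZZY1" cannot clobber part of "XYZZY10".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special inside a PostScript (...) string.
sub ps_escape {
    my ($s) = @_;
    $s =~ s/([\\()])/\\$1/g;
    return $s;
}

# Replace every placeholder in $ps with its escaped value.
# $values is a hash ref mapping placeholder => form value.
sub fill_placeholders {
    my ($ps, $values) = @_;
    return $ps unless %$values;
    # Longest placeholders first, so a short one never matches a prefix
    # of a longer one.
    my $pattern = join '|',
                  map  { quotemeta }
                  sort { length($b) <=> length($a) }
                  keys %$values;
    $ps =~ s/($pattern)/ps_escape($values->{$1})/ge;
    return $ps;
}

# Hypothetical template and values, standing in for the real form data.
my $template = "(XYZZY1) show\n(XYZZY2) show\n";
print fill_placeholders($template, {
    XYZZY1 => 'Jane Q. Public',
    XYZZY2 => '(555) 0100',
});
```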
Use ImageMagick again to convert the PostScript back to PDF.
You can also make the substitutions directly in the PDF file, but
you may throw off the byte offsets and stream lengths recorded in the file,
which causes Acrobat to display a warning that the PDF file is damaged
(Acrobat will probably work fine, but the warning is annoying).
It is easiest to use a fixed-width font (such as Courier)
in your text boxes. This makes width checking for your form
information simple, so you don't overflow.
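Here is a rough sketch of the width check. It relies on every Courier glyph being 600/1000 of the font size wide, so a 10 point Courier character takes 6 points. The sub names and the box width are made up for illustration; measure your own boxes in points.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Advance width of every Courier glyph, in 1/1000ths of the font size.
use constant COURIER_ADVANCE => 600;

# How many Courier characters of $font_size points fit in a box that is
# $box_width points wide.
sub max_chars {
    my ($box_width, $font_size) = @_;
    return int(($box_width * 1000) / (COURIER_ADVANCE * $font_size));
}

# True if $text fits in the box without overflowing.
sub fits {
    my ($text, $box_width, $font_size) = @_;
    return length($text) <= max_chars($box_width, $font_size);
}

# Hypothetical 2-inch (144 point) box with 10 point Courier.
my $name = 'Jane Q. Public';
print fits($name, 144, 10) ? "ok\n" : "too long\n";
```

You can run this check on each form value before substituting it, and reject or truncate anything that does not fit.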
The xfig approach probably won't do exactly what you want by itself.
You will lose too much information in the round trip through
the translator. So combine this approach with a small amount
of PostScript knowledge: translate the text boxes that xfig
creates into raw PostScript, and add that to the
original PostScript document.
You could also use other tools, such as those from
he-whose-name-must-not-be-spoken, to add the text boxes.
Both PostScript and PDF are fully documented by Adobe.
The only part you may have trouble with is assembling all the
necessary pieces of ImageMagick, but some Linux distributions
do a reasonable job of this.
It should work perfectly the first time! - toma