Working with PDFs

xorl has asked for the wisdom of the Perl Monks concerning the following question:

Re: Working with PDFs by talexb (Chancellor) on Jan 11, 2002 at 19:45 UTC
I have been using the PDFLib module (see this web site) to generate PDFs on the fly -- it works great, though it is commercial software. They also sell something called PDI (PDF Import library), which lets you open an existing PDF and retrieve particular pages. (The client didn't purchase PDI; I'm just going by the documentation.) From there, it's probably five minutes' work to write a small utility that opens an input document, retrieves a particular range of pages and writes them to an output document. --t. alex "Excellent. Release the hounds." -- Monty Burns.
Re: Re: Working with PDFs by George_Sherston (Vicar) on Jan 11, 2002 at 23:01 UTC
Worth noting that PDFLib is available under the "Aladdin Public Licence" which lets you use it free for non-commercial projects. § George Sherston
Re: Working with PDFs by Masem (Monsignor) on Jan 11, 2002 at 21:40 UTC
You may want to consider using Ghostscript directly: it can read both PostScript and PDF files, and it can write them as well. You can also limit which pages it outputs, thus effectively 'deleting' pages from the PDF. With Ghostscript, then, you can distill the PS to PDF and thus get back your original file, minus the pages you dropped. You can call Ghostscript directly from Perl if you need to drive it from a script. Ghostscript is free and available for most common platforms.
-----------------------------------------------------
Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain "I can see my house from here!" It's not what you know, but knowing how to find it if you don't know that's important
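As a rough illustration of the Ghostscript route, here is a minimal Perl sketch that copies a page range into a new PDF. It assumes a `gs` executable on your PATH (the name differs on Windows) and a Ghostscript build that supports the `-dFirstPage` and `-dLastPage` switches with the `pdfwrite` device. The sub names and file names are made up for the example, so treat it as a starting point rather than a tested tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the Ghostscript argument list that copies pages $first..$last
# of $in into a new PDF named $out. (Hypothetical helper name.)
sub gs_extract_args {
    my ($in, $out, $first, $last) = @_;
    die "bad page range $first-$last\n"
        unless $first >= 1 && $last >= $first;
    return (
        'gs',                     # assumption: gs is on PATH
        '-q', '-dNOPAUSE', '-dBATCH',
        '-sDEVICE=pdfwrite',      # write PDF rather than render to screen
        "-dFirstPage=$first",
        "-dLastPage=$last",
        "-sOutputFile=$out",
        $in,
    );
}

# Run Ghostscript. The list form of system() bypasses the shell,
# so file names with spaces need no quoting.
sub extract_pages {
    my @cmd = gs_extract_args(@_);
    system(@cmd) == 0
        or die "gs failed with exit status " . ($? >> 8) . "\n";
    return 1;
}

# Usage: perl extract.pl input.pdf output.pdf 2 5
extract_pages(@ARGV) if @ARGV == 4;
```

To drop pages from the middle of a document you would need more than one range, for example by extracting each range to its own file and then passing those files to Ghostscript together to concatenate them.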
Re: Working with PDFs by Anonymous Monk on Jan 11, 2002 at 21:28 UTC
Maybe not a good answer, but here's how I've approached the problem in the past: With Adobe Acrobat 5, you can print the .pdf to a file, so it's saved off as a PostScript file. I have Perl scripts which parse through it (scanning page headers and footers) and pluck out the pages I care about. I guess it really depends on how comfortable you are with PostScript. So you run your script against the .ps, generate a new .ps, and with Adobe Distiller you can convert that back to a .pdf. A longer process, but at least your input file is all ASCII, and you can unleash Perl's capabilities on it.
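One way the page-plucking step might look is sketched below. This is not the poster's script: it assumes the PostScript file follows Adobe's Document Structuring Conventions, so each page starts with a `%%Page: label ordinal` comment and the document ends with `%%Trailer`. It selects pages by ordinal number instead of scanning header and footer text. The sub name `select_ps_pages` is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Given the lines of a DSC-structured PostScript file and a hash of
# page ordinals to keep, return a reference to the filtered lines.
sub select_ps_pages {
    my ($lines, $keep) = @_;
    my @out;
    my $copying = 1;    # the prolog before the first page is always kept
    for my $line (@$lines) {
        if ($line =~ /^%%Page:.*\s(\d+)\s*$/) {
            # The last number on a %%Page: line is the page ordinal.
            $copying = $keep->{$1} ? 1 : 0;
        }
        elsif ($line =~ /^%%Trailer/) {
            $copying = 1;    # the trailer is always kept
        }
        push @out, $line if $copying;
    }
    return \@out;
}

# Usage: perl pluck.pl input.ps 2 5 7 > output.ps
if (@ARGV >= 2) {
    my ($file, @pages) = @ARGV;
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    my @lines = <$fh>;
    close $fh;
    print @{ select_ps_pages(\@lines, { map { $_ => 1 } @pages }) };
}
```

Note that this leaves the `%%Pages:` count and the page ordinals untouched, so the result is no longer strictly DSC-conforming. Whether that matters depends on what you feed the output to.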

