PerlMonks  

Working with PDFs

by xorl (Deacon)
on Jan 11, 2002 at 19:37 UTC ( [id://138027]=perlquestion )

xorl has asked for the wisdom of the Perl Monks concerning the following question:

I want to take a PDF and delete several pages from it. I've found a few things on CPAN for working with PDFs but they all seem buggy or have so little documentation that I can't tell if it's a bug or a feature. Plus none of them actually seems to do what I want. Does anyone have any idea how to do this? Thanks

Replies are listed 'Best First'.
Re: Working with PDFs
by talexb (Chancellor) on Jan 11, 2002 at 19:45 UTC
    I have been using the PDFLib module (see the PDFLib web site) to generate PDFs on the fly. It works great, though it is commercial software. They also sell something called PDI (PDF Import library) which will let you open an existing PDF and retrieve particular pages by number. (The client didn't purchase PDI; I'm just going by the documentation.)

    From there, it's probably five minutes' work to write a small utility that opens an input document, retrieves a particular range of pages and writes them to an output document.

    --t. alex

    "Excellent. Release the hounds." -- Monty Burns.

      Worth noting that PDFLib is available under the "Aladdin Public Licence" which lets you use it free for non-commercial projects.

      § George Sherston
Re: Working with PDFs
by Masem (Monsignor) on Jan 11, 2002 at 21:40 UTC
    You may want to consider using Ghostscript directly: it can read both PostScript and PDF files, and it can write both as well. You can limit which pages it outputs, effectively 'deleting' pages from the PDF. If you go by way of PostScript, Ghostscript can then distill the .ps back to PDF, giving you your original file minus the unwanted pages. You can call Ghostscript from Perl if you need to script it. Ghostscript is free and available for most common platforms.
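
    As a rough sketch of the page-limiting idea (in Python here; the same argument list can be handed to Perl's list-form system()), the helpers below work out which page ranges survive a deletion and build one Ghostscript call per range. The function names, the file names, and the assumption that the caller already knows the total page count are all illustrative; the switches are Ghostscript's standard -dFirstPage, -dLastPage and pdfwrite device options.

```python
import subprocess


def kept_ranges(total_pages, deleted):
    """Return (first, last) page ranges, 1-based and inclusive, left after removing `deleted`."""
    ranges = []
    start = None  # first page of the range currently being collected
    for page in range(1, total_pages + 1):
        if page in deleted:
            if start is not None:
                ranges.append((start, page - 1))
                start = None
        elif start is None:
            start = page
    if start is not None:
        ranges.append((start, total_pages))
    return ranges


def build_gs_command(src, dst, first_page, last_page, gs="gs"):
    """Build the argument list that writes pages first_page..last_page of src to dst as PDF."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("need 1 <= first_page <= last_page")
    return [
        gs,
        "-q", "-dBATCH", "-dNOPAUSE",   # run quietly and exit when done
        "-sDEVICE=pdfwrite",            # write PDF output
        f"-dFirstPage={first_page}",
        f"-dLastPage={last_page}",
        f"-sOutputFile={dst}",
        src,
    ]


def extract_range(src, dst, first_page, last_page):
    """Run Ghostscript for one range (requires gs on the PATH; not called at import time)."""
    subprocess.run(build_gs_command(src, dst, first_page, last_page), check=True)
```

    To drop pages from the middle of a document, one option is to extract each kept range to its own file and then run the pdfwrite device once more with all the part files listed as inputs, which writes them out as a single PDF.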

    -----------------------------------------------------
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important

Re: Working with PDFs
by Anonymous Monk on Jan 11, 2002 at 21:28 UTC
    Maybe not a good answer, but here's how I've approached the problem in the past: With Adobe Acrobat 5, you can print the .pdf to a file, so it's saved as a PostScript file.

    I have Perl scripts which parse through that file (scanning page headers and footers) and pluck out the pages I care about. I guess it really depends on how comfortable you are with PostScript.

    So you run your script against the .ps, generate a new .ps, and with Adobe Distiller you can convert that back to a .pdf. A longer process, but at least your input file is all ASCII, and you can unleash Perl's capabilities on it.
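
    A minimal sketch of that kind of filter follows, written in Python although the same loop is a few lines of Perl. It is not the poster's script: instead of scanning header and footer text, it assumes the PostScript file follows Adobe's Document Structuring Conventions, so every page starts with a "%%Page:" comment and the pages are followed by "%%Trailer". The function name and the sample lines are made up for illustration.

```python
def drop_pages(ps_lines, deleted):
    """Return ps_lines without the pages whose 1-based position is in `deleted`.

    Assumes a DSC-conforming file: prolog, then "%%Page:" sections, then "%%Trailer".
    """
    out = []
    keep = True    # everything before the first %%Page: (header, prolog) is kept
    page_no = 0    # counts %%Page: comments seen so far
    for line in ps_lines:
        if line.startswith("%%Page:"):
            page_no += 1
            keep = page_no not in deleted
        elif line.startswith("%%Trailer"):
            keep = True  # trailer and everything after it is kept
        if keep:
            out.append(line)
    return out


sample = [
    "%!PS-Adobe-3.0\n", "%%Pages: 3\n", "%%EndProlog\n",
    "%%Page: 1 1\n", "(a) show\n", "showpage\n",
    "%%Page: 2 2\n", "(b) show\n", "showpage\n",
    "%%Page: 3 3\n", "(c) show\n", "showpage\n",
    "%%Trailer\n", "%%EOF\n",
]
result = drop_pages(sample, {2})
```

    The sketch leaves the "%%Pages:" count and the page ordinals as they were, which a strict DSC consumer may object to, and it does not handle "%%Page:" comments nested inside embedded documents. It also only works when each page is self-contained, which is not guaranteed for every PostScript file.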
