in reply to How to do in "PDF::Extract"?

This could be a side effect of Acrobat doing some "auto repair" upon reading in the file (see also Unexpected & Unnecessary Popup "do you want to save changes to 'x' before closing ?"). What triggers the auto repair has yet to be determined, though...

If you agree to save the changes and then compare the newly saved version with the original one, are there any differences? (Use a tool that compares binary files, such as "cmp -bl ..." on Unix.)
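To illustrate the comparison, here is a minimal sketch. It uses two throwaway files as stand-ins for the original PDF and the copy re-saved by Acrobat; the variable names and file contents are made up for the example, so substitute your real file paths.

```shell
# Stand-in files (hypothetical); replace with the PDF::Extract output
# and the copy that Acrobat re-saved.
orig=$(mktemp)
resaved=$(mktemp)
printf 'abcdef' > "$orig"
printf 'abXdeY' > "$resaved"

# cmp -l lists every differing byte: its position (decimal, 1-based)
# followed by the two byte values in octal.
# cmp exits non-zero when the files differ, hence the "|| true".
cmp -l "$orig" "$resaved" || true

# Count how many bytes differ.
ndiff=$( { cmp -l "$orig" "$resaved" || true; } | wc -l | tr -d ' ')
echo "differing bytes: $ndiff"
```

A handful of differing bytes near the end of the file would point at the xref/trailer area; a large count usually just means the whole file was rewritten.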

Update: for testing purposes, you could try (dummy-)processing the PDF file created by PDF::Extract with some other tool, e.g. the already mentioned "pdf tool kit" (pdftk):

$ pdftk original.pdf output fixed.pdf

This would fix anything that pdftk considers worth fixing... Then check whether Acrobat still exhibits the undesired behavior when opening this fixed.pdf.

BTW, you could also use pdftk directly to extract all pages:

$ pdftk sample.pdf burst

As hda said, an excellent tool — highly recommended!

Re^2: How to do in "PDF::Extract"? (Acrobat auto repairing?)
by gone2015 (Deacon) on Dec 19, 2008 at 14:23 UTC
    This could be a side effect of Acrobat doing some "auto repair" upon reading in the file...

    Having tried PDF::Extract on some randomly selected PDF files, I find that its output is badly broken -- in particular, the 'xref' table is a mess.

    When Adobe Acrobat (Professional 8.1.2) read the output of PDF::Extract (which it wasn't always prepared to do), it would offer to save changes on close -- and the result was a cleaned-up file.
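One quick way to spot this kind of xref damage is sketched below. It assumes a classic cross-reference table (not a cross-reference stream): the number after the final "startxref" should be the byte offset of the "xref" keyword. The sketch builds a tiny stand-in file so it runs on its own; the file and the variable names are hypothetical, and a real check would point "$pdf" at the PDF::Extract output instead.

```shell
# Build a minimal stand-in file (not a complete PDF) whose startxref
# value points at its xref keyword.
pdf=$(mktemp)
printf '%%PDF-1.4\n1 0 obj\n<<>>\nendobj\n' > "$pdf"
xref_pos=$(wc -c < "$pdf" | tr -d ' ')   # byte offset where "xref" will start
printf 'xref\n0 1\n0000000000 65535 f \ntrailer\n<< /Size 1 >>\nstartxref\n%s\n%%%%EOF\n' "$xref_pos" >> "$pdf"

# Read the offset recorded after the last "startxref" near the end of the file.
offset=$(tail -c 1024 "$pdf" | grep -a -A1 '^startxref' | tail -n 1 | tr -d '\r ')

# Fetch the 4 bytes at that offset; in a consistent file they spell "xref".
keyword=$(dd if="$pdf" bs=1 skip="$offset" count=4 2>/dev/null)

if [ "$keyword" = "xref" ]; then
    echo "startxref offset $offset points at the xref table"
else
    echo "startxref offset $offset does NOT point at an xref table"
fi
```

This only checks one pointer; the individual object offsets inside the table can still be wrong even when this test passes.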