in reply to Detecting PDF content
Here is the start of what it dumps about the first page of The Perl Journal:
    % perl pagedump.pl 0301tpj.pdf 1
    Page 1 Dictionary <<
    Name: /CropBox => Array [
    Number: 0
    Number: 0
    Number: 558
    Number: 756
    ]
    Name: /MediaBox => Array [
    Number: 0
    Number: 0
    Number: 558
    Number: 756
    ]
    Name: /Rotate => Number: 0
    Other: Page_Object => Object: 402 0 R
    Other: Resource_Object => Object: 434 0 R
    >>
    ...

You can probably find a distinct set of components for your image-only cases.
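One way to look for such a distinguishing set is to check which operators appear in a page's content stream. This is only a rough sketch, not what pagedump.pl does: classify_content is a hypothetical helper, and it assumes you have already extracted and decompressed the page's content stream into a string with whatever PDF module you use. It relies on the standard PDF operators Tj, TJ, ' and " for showing text, and Do and BI for painting an XObject or starting an inline image.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): classify one page's
# already-decoded content stream by the operators it contains.
sub classify_content {
    my ($stream) = @_;
    my $ops = $stream;

    # Drop literal strings "( ... )" so words inside the text itself are
    # not mistaken for operators. Nested parentheses are not handled.
    $ops =~ s/\((?:\\.|[^\\()])*\)/ /gs;

    # Treat array brackets as separators, e.g. "[ ... ] TJ".
    $ops =~ s/[\[\]]/ /g;

    my %seen = map { ($_ => 1) } split ' ', $ops;

    # Text-showing operators versus image-painting operators.
    my $text  = grep { $seen{$_} } ('Tj', 'TJ', "'", '"');
    my $image = grep { $seen{$_} } ('Do', 'BI');

    return $text && $image ? 'text and images'
         : $text           ? 'text only'
         : $image          ? 'images only'
         :                   'neither';
}

print classify_content("q 558 0 0 756 0 0 cm /Im0 Do Q"), "\n";
print classify_content("BT /F1 12 Tf 72 700 Td (Hello) Tj ET"), "\n";
```

Treat the result as a hint rather than proof: Do can also paint a form XObject that itself contains text, and binary inline image data can contain bytes that look like operators, so a real check would also follow the page's Resources dictionary.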
Update: Mr. Muskrat and I seem to have different interpretations of your question. I read "detect that" to mean "detect that a file (which is already known to be a pdf file) contains only images rather than images plus text or text alone."
Re: Re: Detecting PDF content
by Rich36 (Chaplain) on Jan 23, 2003 at 22:20 UTC