
First of all, I'm sorry if anything is unclear: English is not my native language.

If you want the full story, here it is (you can skip this part :). We (my company) use specialized OCR software to process PDF bills. I wrote a Perl script that extracts PDF files from emails or retrieves them from our MFT software, then renames them for normalization. When I learned that our OCR software doesn't work well with big PDFs, I tried to add some code to the script that checks whether a file has more than 100 pages and, in that case, keeps only the first 100 pages (and drops the rest).

As I didn't want to bother you with all the details, I only kept the part that cuts the PDF in my first post.

To summarize: for any PDF, I need to keep at most the first 100 pages (if the PDF has 15 pages, I leave it untouched; if it has 654 pages, I create a new PDF containing pages 1 to 100 inclusive).
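For readers who skipped my first post, here is a minimal sketch of that rule with PDF::API2. It is not my exact production code: the helper names pages_to_copy and truncate_pdf are made up for this example, and I use the older method names (pages, importpage, saveas), which as far as I know still work in recent releases alongside the newer underscore names.

```perl
use strict;
use warnings;

# Hypothetical helper: how many pages to copy into a new file,
# or 0 when the PDF is small enough to be left untouched.
sub pages_to_copy {
    my ($page_count, $max_pages) = @_;
    return $page_count > $max_pages ? $max_pages : 0;
}

# Hypothetical helper: write the first $max_pages pages of $in to $out.
# Returns the number of pages written, or 0 if nothing was written.
sub truncate_pdf {
    my ($in, $out, $max_pages) = @_;

    # Loaded here so that pages_to_copy() can be used without the module.
    require PDF::API2;

    my $source = PDF::API2->open($in);
    my $count  = $source->pages();
    my $keep   = pages_to_copy($count, $max_pages);
    return 0 unless $keep;

    my $target = PDF::API2->new();
    # Source pages are numbered from 1; with no target index given,
    # each imported page is appended at the end of the new document.
    $target->importpage($source, $_) for 1 .. $keep;
    $target->saveas($out);
    return $keep;
}
```

It would be called as truncate_pdf('bill.pdf', 'bill_100.pdf', 100), where both file names are placeholders. A file with exactly 100 pages is left untouched, since only files with more than 100 pages need cutting.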

---- End of the story ----

Once again, my script works (99.9% of the time): my problem is not how to write it, but why it fails for one (only one) PDF and what I can do about it (if I can do anything)!

I didn't try the script against the "Modern Perl" file because, unfortunately, I don't have it (yet), but I have a lot of 100+ page PDFs (up to 600 pages) and all but one of them are processed correctly by my script.

I would like to give you this specific PDF, which probably contains something that prevents PDF::API2 from processing it correctly, but I cannot because it contains customer information (I'm looking for a way to obfuscate the content).

What's strange is that I managed to extract 100 pages from this specific PDF using either sejda or CAM::PDF with its extractPages method.

But with PDF::API2, it doesn't work.

In reply to Re^6: blank pdf generated using PDF::API2 (Updated) by lennelei
in thread blank pdf generated using PDF::API2 by lennelei
