First of all, I'm sorry if not everything is clear: English is not my native language.
If you want the full story, here it is (you can skip this part :). We (my company) have a specific OCR software to handle PDF bills. I wrote a Perl script that extracts PDF files from emails or retrieves them from our MFT software, then renames them for normalization. When I learned that our OCR software doesn't work well with big PDFs, I tried to add some code to the script to check whether a file has more than 100 pages and, in that case, keep only the first 100 pages (and drop the rest).
As I didn't want to bother you with all the details, I only kept the part that cuts the PDF in my first post.
To summarize: for any PDF, I need to keep at most the first 100 pages (if the PDF has 15 pages, I leave it untouched; if it has 654 pages, I create a new PDF with pages 1 to 100 inclusive).
---- End of the story ----
Once again, my script is working (99.9% of the time): my problem is not how to write it, but why it fails for one (and only one) PDF and what I can do about it (if I can do anything)!
I didn't try the script against the "Modern Perl" file because, unfortunately, I don't have it (yet), but I have a lot of 100+ page PDFs (up to 600 pages) and all of them (but one) are correctly processed by my script.
I would like to provide you with this specific PDF, which probably contains something that prevents PDF::API2 from processing it correctly, but I cannot, as it contains customer information (I'm looking for a way to obfuscate the content).
What's strange is that I managed to extract 100 pages from this specific PDF using sejda, and also using CAM::PDF and its extractPages method.
But with PDF::API2, it's not working.
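
For reference, here is a minimal sketch of the CAM::PDF workaround mentioned above, not the actual production script. The helper names `range_to_keep` and `truncate_pdf`, and the file names in the usage line, are only made up for illustration; the 100-page limit is the one described in the story. It assumes CAM::PDF is installed and only loads it when a file is actually processed.

```perl
use strict;
use warnings;

# Hypothetical helper: return the page range to keep (e.g. "1-100"),
# or undef when the document is already short enough to leave untouched.
sub range_to_keep {
    my ($page_count, $max_pages) = @_;
    return undef if $page_count <= $max_pages;
    return "1-$max_pages";
}

# Hypothetical helper: write the first $max_pages pages of $in to $out.
# Returns 1 if a truncated copy was written, 0 if the file was left alone.
sub truncate_pdf {
    my ($in, $out, $max_pages) = @_;

    require CAM::PDF;    # loaded lazily, only when a file is processed
    my $doc = CAM::PDF->new($in)
        or die "Cannot open $in\n";

    my $range = range_to_keep($doc->numPages(), $max_pages);
    return 0 unless defined $range;

    $doc->extractPages($range);    # drops every page outside the range
    $doc->cleanoutput($out);       # writes the reduced document
    return 1;
}
```

It would be called as `truncate_pdf('bill.pdf', 'bill_100.pdf', 100);`. This only shows the path that worked for the problematic file; it does not explain why the PDF::API2 version produces a blank result.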