Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
My script is downloading a series of JPGs:
www.website.com/1/1.jpg
www.website.com/1/2.jpg
www.website.com/1/3.jpg
[...]
www.website.com/2/1.jpg
www.website.com/2/2.jpg
www.website.com/2/3.jpg
and so on.
And let's say that some of those JPGs, despite their existence being implied, might not exist.
When my code gets to a "bad patch", say there's no 42 folder, it goes on downloading
www.website.com/42/1.jpg
www.website.com/42/2.jpg
www.website.com/42/3.jpg
and saving them to disk as .JPG files, but in reality it's grabbing the 404 message from the website.
My question is, what's the easiest way to check for the case when the URL "www.website.com/42/1.jpg" is sending back an HTML "sorry" page, not a JPG?
I'm just using LWP::Simple at the moment with getstore().
Should I do a HEAD request first? Or just get the URL anyway, check its MIME type, and only store it if it's image/jpeg? Or can I trust that a site will always send a 404 response code for a URL that doesn't exist, and go by that? Which is most efficient? Will I need to switch to LWP::UserAgent instead of LWP::Simple?
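For illustration, here is a minimal sketch of the "get it anyway, then check" option using LWP::UserAgent. It only writes the file when the response is a success and the Content-Type is image/jpeg, so it should also catch a site that sends its "sorry" page with a 200 status. The sub names is_jpeg_response and fetch_jpeg are made up for this example, and the http:// scheme in the usage comment is an assumption, since the URLs above don't show one.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# True only for a 2xx response whose Content-Type is image/jpeg.
# content_type() returns the media type lower-cased, without parameters.
sub is_jpeg_response {
    my ($res) = @_;
    return 0 unless $res->is_success;
    return $res->content_type eq 'image/jpeg' ? 1 : 0;
}

# Fetch $url into memory and write it to $file only if it passes the
# check above. Returns 1 when the file was saved, 0 when it was skipped.
sub fetch_jpeg {
    my ($ua, $url, $file) = @_;
    my $res = $ua->get($url);
    return 0 unless is_jpeg_response($res);
    open my $fh, '>:raw', $file or die "Cannot write $file: $!";
    print {$fh} $res->content;
    close $fh or die "Cannot close $file: $!";
    return 1;
}

# Hypothetical usage (scheme and file name are assumptions):
#   my $ua = LWP::UserAgent->new;
#   fetch_jpeg($ua, 'http://www.website.com/42/1.jpg', '42_1.jpg')
#       or warn "skipped 42/1.jpg\n";
```

This costs one request per image rather than a HEAD followed by a GET. If staying with LWP::Simple is preferred, getstore() returns the HTTP status code, which can be passed to is_success(), but that alone won't catch a 200 response carrying an HTML page.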