Let's say that hypothetically I'm downloading JPGs from a website.

My script is downloading a series of JPGs:

www.website.com/1/1.jpg
www.website.com/1/2.jpg
www.website.com/1/3.jpg
[...]
www.website.com/2/1.jpg
www.website.com/2/2.jpg
www.website.com/2/3.jpg

and so on.

And let's say that some of those JPGs, despite their existence being implied, might not exist.

When my code gets to a "bad patch", say there's no 42 folder, it goes on downloading

www.website.com/42/1.jpg
www.website.com/42/2.jpg
www.website.com/42/3.jpg

and saving them to disk as .JPG files, but in reality it's grabbing the 404 message from the website.

My question is, what's the easiest way to check for the case when the URL "www.website.com/42/1.jpg" is sending back an HTML "sorry" page, not a JPG?

I'm just using LWP::Simple at the moment with getstore().

Should I do a HEAD request first? Or just GET the URL anyway, check its MIME type, and only store the file if it's image/jpeg? Or can I trust that a site will always return a 404 response code for a URL that doesn't exist, and go by that? Which is most efficient? Will I need to switch from LWP::Simple to LWP::UserAgent?
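For illustration, here is a minimal sketch of the "GET it, then check before storing" option, assuming a switch to LWP::UserAgent. The sub names looks_like_jpeg and fetch_jpeg are hypothetical helpers, not part of LWP, and the three checks (status 200, a Content-Type of image/jpeg, and the JPEG magic bytes FF D8 FF at the start of the body) are one possible policy rather than the only reasonable one.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether a response is really a JPEG.
# Takes the HTTP status code, the Content-Type media type, and the raw body.
sub looks_like_jpeg {
    my ($status, $content_type, $bytes) = @_;

    # A missing file should give a non-200 status on a well-behaved server.
    return 0 unless defined $status && $status == 200;

    # Some sites answer 200 with an HTML "sorry" page, so check the type too.
    return 0 unless defined $content_type && lc($content_type) eq 'image/jpeg';

    # Last line of defence: every JPEG file starts with the bytes FF D8 FF.
    return 0 unless defined $bytes && substr($bytes, 0, 3) eq "\xFF\xD8\xFF";

    return 1;
}

# Hypothetical helper: fetch $url and write it to $file only if it passes
# the checks above. Returns 1 if the file was stored, 0 otherwise.
sub fetch_jpeg {
    my ($url, $file) = @_;

    # Loaded here so the check above can be used without LWP installed.
    require LWP::UserAgent;

    my $ua  = LWP::UserAgent->new(timeout => 30);
    my $res = $ua->get($url);

    # content_type returns a list in list context, so force scalar context.
    my $type = scalar $res->content_type;

    return 0 unless looks_like_jpeg($res->code, $type, $res->content);

    open my $fh, '>:raw', $file or die "Cannot write $file: $!";
    print {$fh} $res->content;
    close $fh or die "Cannot close $file: $!";

    return 1;
}
```

With this in place, a call such as fetch_jpeg("http://www.website.com/42/1.jpg", "42_1.jpg") would return 0 and write nothing when the server sends back an HTML page. If the site can be trusted to return a real 404, staying with LWP::Simple is simpler: getstore() returns the HTTP status code, which can be tested with is_success(). The strict image/jpeg comparison is an assumption; a server that sends a nonstandard type such as image/jpg would need a looser match.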


In reply to download JPG series with error-handling by Anonymous Monk
