jasonjohn has asked for the wisdom of the Perl Monks concerning the following question:

What information can I find out about a JPEG given only its URL using Perl?

Can I determine its modify date? File Size?

I notice that most browsers can determine the size of files being downloaded. Do I need to begin a download in order to get this information?

If I were to download the first 500 bytes (or so) of the JPEG, what would that tell me about the image?

I guess my main concern is "has this image changed?" without actually downloading an image referenced by HTTP. If I already "know" the expected file size, or modify date, or details in the JPEG header, what's the minimum that I need to do (or download) to answer "has this image changed?"

thanks :)


Replies are listed 'Best First'.
Re: Getting JPEG details without downloading
by ikegami (Patriarch) on Sep 29, 2004 at 23:57 UTC

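    If you do a HEAD request, the server will most likely return the file size (Content-Length) and the file's last-modified time (Last-Modified). You can also do a conditional GET by providing an "If-Modified-Since" header, in which case the server sends the body only if the file has changed since that date. In both cases, see the HTTP/1.1 spec for details.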

    $ GET -m HEAD 'http://www.adaelis.com/misc/temp/monks.jpg'
    200 OK
    Connection: close
    Date: Thu, 30 Sep 2004 00:04:47 GMT
    Accept-Ranges: bytes
    ETag: "3cb18d-f292-411ee589"
    Server: Apache/1.3.31 (Unix) mod_fastcgi/2.4.2
    Content-Length: 62098                            <-------
    Content-Type: image/jpeg
    Last-Modified: Sun, 15 Aug 2004 04:24:41 GMT     <-------
    Client-Date: Thu, 30 Sep 2004 00:04:47 GMT
    Client-Peer: 63.111.28.139:80
    Client-Response-Num: 1
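
The same two checks can be done from Perl rather than the command line. What follows is only a minimal sketch, assuming the core HTTP::Tiny module (LWP would work just as well). The helper names `headers_differ`, `fetch_headers` and `fetch_if_modified` are made up for illustration and come from neither the thread nor any library.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Compare metadata recorded earlier against fresh response headers.
# Returns 1 if any recorded field is missing or different, else 0.
sub headers_differ {
    my ($known, $headers) = @_;
    for my $field (sort keys %$known) {
        my $new = $headers->{$field};
        return 1 if !defined $new || $new ne $known->{$field};
    }
    return 0;
}

# HEAD request: returns the response headers only, no image body.
# HTTP::Tiny lower-cases the header names.
sub fetch_headers {
    my ($url) = @_;
    my $res = HTTP::Tiny->new(timeout => 10)->head($url);
    die "HEAD $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return $res->{headers};
}

# Conditional GET: returns nothing on "304 Not Modified",
# otherwise the (new) image bytes.
sub fetch_if_modified {
    my ($url, $since) = @_;
    my $res = HTTP::Tiny->new(timeout => 10)->get(
        $url, { headers => { 'If-Modified-Since' => $since } },
    );
    return if $res->{status} == 304;
    die "GET $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return $res->{content};
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $headers = fetch_headers($ARGV[0]);
    print "$_: ", ($headers->{$_} // '(not sent)'), "\n"
        for qw(content-length last-modified etag);
}
```

Storing `content-length`, `last-modified` and `etag` from one run and passing them to `headers_differ` on the next gives a cheap "has it changed?" test, as long as the server actually sends those headers.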

    As for your second question, the minimum size required to get some basic info, why don't you take a peek into Image::Info's docs and guts?
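
If you do want only the first few hundred bytes, the "Accept-Ranges: bytes" line in the output above suggests the server would honour a Range request. This is a rough sketch under that assumption, again using HTTP::Tiny. `range_header` and `fetch_prefix` are hypothetical helper names, and a server that ignores Range will simply send the whole file.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Build a Range header value covering the first $n bytes.
# Byte ranges are inclusive, so 500 bytes is "bytes=0-499".
sub range_header {
    my ($n) = @_;
    return 'bytes=0-' . ($n - 1);
}

# Fetch at most the first $n bytes of $url.
sub fetch_prefix {
    my ($url, $n) = @_;
    my $res = HTTP::Tiny->new(timeout => 10)->get(
        $url, { headers => { Range => range_header($n) } },
    );
    die "GET $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    # 206 means the range was honoured; 200 means the server ignored
    # it and sent everything, so trim to $n bytes either way.
    return substr($res->{content}, 0, $n);
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $bytes = fetch_prefix($ARGV[0], 500);
    printf "got %d bytes\n", length $bytes;
}
```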

Re: Getting JPEG details without downloading
by tachyon (Chancellor) on Sep 30, 2004 at 01:19 UTC

    You only need the Last-Modified date from the HTTP response headers to see whether the image has *changed*. Bear in mind that simply touching the file on the server changes its last-modified time, in which case the image is identical and only the mtime differs.

    The top of an image file contains header info about the image; the size info (width and height), for example, appears near the top. With GIFs it is always 6 bytes in. With PNGs it is 16 bytes in, immediately after the IHDR token. JPEGs are less convenient: the size info sits a variable distance in, with 163, 194, 612 and 746 bytes being common offsets. The point is that 500 bytes is not always enough of a JPEG.

    You should use Image::Info, which can extract lots of different features. It will barf on partial (== corrupt) images, so you will probably need to hack it a bit.
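
To make the offsets concrete, here is a hand-rolled sketch of pulling width and height out of a possibly truncated buffer. It is not how Image::Info does it, and `image_size` is a made-up name. It assumes a well-formed file with no padding bytes between JPEG segments, and returns an empty list when the buffer ends before the size info is reached.

```perl
use strict;
use warnings;

# Return (width, height) from the leading bytes of an image, or an
# empty list if the format is unknown or the buffer is too short.
sub image_size {
    my ($buf) = @_;

    # GIF: 6-byte signature, then little-endian 16-bit width, height.
    if ($buf =~ /\AGIF8[79]a/ && length($buf) >= 10) {
        return unpack 'x6 v v', $buf;
    }

    # PNG: 8-byte signature, 4-byte chunk length, "IHDR", then
    # big-endian 32-bit width and height starting at offset 16.
    if ($buf =~ /\A\x89PNG\r\n\x1a\n/ && length($buf) >= 24) {
        return unpack 'x16 N N', $buf;
    }

    # JPEG: walk marker segments until a start-of-frame (SOF) marker.
    if ($buf =~ /\A\xFF\xD8/) {
        my $pos = 2;
        while ($pos + 4 <= length $buf) {
            my ($ff, $marker, $len) = unpack 'C C n', substr($buf, $pos, 4);
            last unless $ff == 0xFF;
            # SOF0..SOF15, excluding DHT (C4), JPG (C8) and DAC (CC).
            if (   $marker >= 0xC0 && $marker <= 0xCF
                && $marker != 0xC4 && $marker != 0xC8 && $marker != 0xCC) {
                last if $pos + 9 > length $buf;    # frame header cut off
                # Layout: marker(2) length(2) precision(1) height(2) width(2)
                my ($height, $width) = unpack 'n n', substr($buf, $pos + 5, 4);
                return ($width, $height);
            }
            $pos += 2 + $len;    # skip the marker plus the whole segment
        }
    }
    return;
}
```

Feeding this the first 500 bytes of a JPEG whose frame header sits at offset 612 or 746 yields an empty list, which is exactly the problem described above: a large APPn segment (EXIF data, thumbnails) can push the size info well past any fixed prefix.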

    cheers

    tachyon