Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

For the past day or so I've been trying to think of a way to use LWP::Simple to download an image when direct access to its URL is denied by the server. When you view the page in a browser, the image ends up in the browser's temporary directory.

If direct access to a URL isn't allowed, how would it be possible to get the image? Is there a way to store it in memory when LWP loads it? Actually, when I think about it, that doesn't make any sense.

But is it possible?

Re: getstore protected image
by zentara (Cardinal) on Mar 31, 2006 at 22:47 UTC
    Your question doesn't make sense. If you can get to it with a browser, so that the browser can store it in its cache, you can read it out of the cache or save the image. If the browser can get to it but LWP can't, it may be that the server is filtering on browser names. Try to fake out the server:
    use LWP::UserAgent;

    # set up your browser
    my $ua = LWP::UserAgent->new(keep_alive => 1, timeout => 300);
    # what kind of browser you are
    $ua->agent("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) Gecko/20021207 Phoenix/0.5");
    If that doesn't work, there may be some JavaScript magic going on, and you may need WWW::Mechanize. Try to capture the actual transfer with ethereal or tcpick to see what the browser and server are really doing.
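
    Before reaching for a packet sniffer, it can help to check what LWP itself is going to send. The following is only a sketch: the image URL is a made-up placeholder, and prepare_request() is used so that the headers can be inspected without contacting any server.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Browser-like identity; the exact string is only an example.
my $agent = 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3a) '
          . 'Gecko/20021207 Phoenix/0.5';

my $ua = LWP::UserAgent->new(keep_alive => 1, timeout => 300);
$ua->agent($agent);

# Hypothetical image URL, for illustration only.
my $req = HTTP::Request->new(GET => 'http://example.com/images/photo.jpg');

# prepare_request() copies the user agent's default headers (User-Agent
# among them) into the request; it does not open a network connection.
$ua->prepare_request($req);

# Show the request line and headers as LWP would build them.
print $req->as_string;
```

    If the User-Agent line printed here is the faked one and the server still refuses, the filter is probably looking at something else, such as the Referer header or a cookie.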

    I'm not really a human, but I play one on earth. flash japh
      Actually what I mean is I can access the page that SHOWS the image in my browser just fine but I cannot type in the URL of the image directly to access the image. It results in a FORBIDDEN error.

      I can load the page that loads the image but I can't load the image directly. What I need is a way to load the image anyway.

        Well, it's hard to say when you don't give the URL. Is it a CGI script? A CGI script can deliver a page and an image that are custom made on demand. Not all web pages are static HTML.

        I'm not really a human, but I play one on earth. flash japh
Re: getstore protected image
by jpeg (Chaplain) on Mar 31, 2006 at 23:17 UTC
    Can you clarify what you mean by 'direct access'?

    If you're talking about direct access to the URL of the image itself being blocked - as in, blocked in the browser when you're trying to get the image independent of the page - that makes me suspect the server is checking HTTP_REFERER and blocking requests on that criterion. You could use LWP to get the page, build a list of its images, and send a new series of GET requests after setting the Referer header of each HTTP::Request object to the URL of the original page.

    HTH.

    --
    jpg
      I mean I can't copy and paste the URL of the image and get it.

      Are you saying I could get the image off the page and toss in a fake HTTP_REFERER while I call another get on the image and it might fake it?

      Any example how to do that?

        I mean:

        - GET the original html page containing img links
        - build an array of URLs for the images
        - for each image URL, use LWP to GET the image. In this step, you'll need to change the referer to the URL of the original page.

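        See perldoc LWP and perldoc HTTP::Headers. When you use LWP you create an LWP::UserAgent object, and its get() method accepts extra header fields as key/value pairs after the URL, so you can pass 'Referer' => $originalURL there. Alternatively, populate an HTTP::Headers object with... headers, including 'Referer' => $originalURL, hand it to HTTP::Request->new, and send that request with the user agent's request() method.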
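
        The three steps could be sketched roughly as below. This is an untested-against-your-site sketch, not a definitive solution: it assumes the server only checks the Referer header, the helper names image_urls, image_request and fetch_images are made up for this example, and images are saved under their base names in the current directory (so same-named files overwrite each other).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Headers;
use HTML::LinkExtor;
use URI;
use File::Basename qw(basename);

# Step 2: collect the absolute URLs of all <img src="..."> in an HTML string.
sub image_urls {
    my ($html, $base_url) = @_;
    my @urls;
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        push @urls, URI->new_abs($attr{src}, $base_url)->as_string
            if $tag eq 'img' && defined $attr{src};
    });
    $parser->parse($html);
    $parser->eof;
    return @urls;
}

# Step 3 (part one): build a GET request for one image that names the
# original page as the referring document.
sub image_request {
    my ($img_url, $page_url) = @_;
    my $headers = HTTP::Headers->new(Referer => $page_url);
    return HTTP::Request->new(GET => $img_url, $headers);
}

# Steps 1 to 3 together: fetch the page, then fetch every image on it.
sub fetch_images {
    my ($page_url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 30);

    my $page = $ua->get($page_url);
    die "Cannot fetch $page_url: ", $page->status_line, "\n"
        unless $page->is_success;

    for my $img_url (image_urls($page->decoded_content, $page->base)) {
        my $file = basename(URI->new($img_url)->path) || 'image';
        # A second argument to request() names a file to store the body in.
        my $res = $ua->request(image_request($img_url, $page_url), $file);
        print "$img_url: ", $res->status_line, "\n";
    }
}

# Only touch the network when a page URL is given on the command line.
fetch_images($ARGV[0]) if @ARGV;
```

        Run it with the URL of the page that shows the image as its only argument. If the images still come back as forbidden, the server is probably checking more than the Referer (cookies or the User-Agent, for instance).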

        --
        jpg