PerlMonks  

Finding the URL of this image

by sulfericacid (Deacon)
on May 16, 2007 at 23:13 UTC (perlquestion, id://615900)

sulfericacid has asked for the wisdom of the Perl Monks concerning the following question:

I posted a SOPW earlier without a good example, so here's a second shot. I'm using WWW::Mechanize to grab URLs of images from a particular web site. The problem is, once in a while the server seems to redirect the image URL to another location with an IP address.

I need a way to determine the FINAL URL of the image, so one way or another I need to know whether the first URL redirects or loads directly.

A good example is this link (it is safe for work, just a Spiderman picture) http://images.imagefap.com/images/full/31/459/459526477.jpg. You'll notice it makes your browser go to http://85.17.40.9:99/31a/full/459/459526477.jpg instead. It's the same URL after the domain name/IP address.

How can I get my script to load the IP address if it pops up?



"Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

sulfericacid

Replies are listed 'Best First'.
Re: Finding the URL of this image
by superfrink (Curate) on May 17, 2007 at 00:03 UTC
    When you connect to the first URL, the web server sends back a few headers. The header indicating the redirection is "Location". I ran
    wget -O /dev/null -S 'http://images.imagefap.com/images/full/31/459/459526477.jpg'
    Note this part of the output:
    HTTP/1.0 302 Found
    Connection: keep-alive
    Location: http://85.17.40.9:99/31a/full/459/459526477.jpg
    Content-Length: 0
    Date: Thu, 17 May 2007 05:11:56 GMT
    Server: lighttpd/1.4.11
    As for how you can see that with WWW::Mechanize, I don't know, but the docs read:
    WWW::Mechanize is a proper subclass of LWP::UserAgent and you can also use any of LWP::UserAgent's methods.
    The docs for LWP::UserAgent indicate an HTTP::Response object is available, and its header() method might be what you are looking for.
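
A minimal sketch of the HTTP::Response idea above, assuming the HTTP::Message distribution (HTTP::Request, HTTP::Response) is installed. LWP follows redirects by default and links each response to the one before it, so the final URL can be read from the last response's request, and the intermediate "Location" headers from its redirect chain. The helper names final_url and redirect_locations are made up for illustration, and the responses are built by hand so the example runs without touching the network.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: the URL of the request that produced the final
# response, i.e. where we ended up after any redirects were followed.
sub final_url {
    my ($response) = @_;
    return $response->request->uri->as_string;
}

# Hypothetical helper: the Location header of every intermediate
# redirect response, oldest first.
sub redirect_locations {
    my ($response) = @_;
    return map { scalar $_->header('Location') } $response->redirects;
}

# Offline stand-in for what a user agent would hand back: a 302 response
# for the original URL, chained in front of a 200 for the IP-based URL.
my $original = 'http://images.imagefap.com/images/full/31/459/459526477.jpg';
my $target   = 'http://85.17.40.9:99/31a/full/459/459526477.jpg';

my $first = HTTP::Response->new(302, 'Found', [ Location => $target ]);
$first->request(HTTP::Request->new(GET => $original));

my $last = HTTP::Response->new(200, 'OK');
$last->request(HTTP::Request->new(GET => $target));
$last->previous($first);

print "final URL: ", final_url($last), "\n";
print "redirected via: $_\n" for redirect_locations($last);
```

With a live WWW::Mechanize object, the response returned by $mech->get($url) can be passed to the same helpers, and $mech->uri should report the current URL after redirects as well.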
Re: Finding the URL of this image
by eric256 (Parson) on May 17, 2007 at 00:03 UTC

    Warning: Investigating the url (imagefap.com) provided is NSFW. I would have preferred if that was mentioned in the message somewhere.

    Besides that, I'm not sure how comfortable I am helping someone who is apparently trying to rip off images (of whatever nature, I don't care) from a site. Perhaps you have something to say that would put your request in a less suspicious light?


    ___________
    Eric Hodges
      Wow, it's been a while since I've been here, and I missed your post.

      The page linked is SFW; if you click elsewhere, it may or may not be.

      As for your other comment, who hasn't created an image bot before? I remember Merlyn has, among many other honorable mentions. And not one image was taken from their web site; everything is hotlinked, as their TOS allows.



      "Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

      sulfericacid

        I couldn't read their TOS without it becoming NSFW. In addition, following a picture-sharing website's TOS doesn't mean that using the picture is legal in any way. Either way, I just asked for clarification and warned others that it is certainly not safe for work if you do any investigating of it at all.


        ___________
        Eric Hodges

Node Type: perlquestion [id://615900]
Approved by Moriarty