All fine and dandy when you're dealing with normal stuff. The problem is that I need to check whether the response is a redirect (a 301 or a 302). Robots NEVER returns a 301 or a 302, always a 200 (Success), even on redirected pages. For example, I have a page locally which redirects to Google:

    C:\Documents and Settings\gecko\Desktop>nc localhost 80
    GET /cgi-bin/redirect.pl HTTP/1.1
    host:localhost

    HTTP/1.1 302 Moved
    Date: Sun, 17 Jun 2007 02:34:36 GMT
    Server: Apache/2.2.4 (Win32)
    Location: http://www.google.com
    Content-Length: 0
    Content-Type: text/plain

Yet when I do it with Robots:

    ORIG_URL: http://127.0.0.1/cgi-bin/redirect.pl
    URL: Cache-Control
    Date
    Server
    Content-Type
    Client-Date
    Client-Peer
    Client-Response-Num
    Client-Transfer-Encoding
    Set-Cookie
    Title

    RESPONSE: 200

    URL: http://www.google.com/
    ORIG_URL: http://127.0.0.1/cgi-bin/redirect.pl
    RESPONSE: 200
    SIZE: 5799
    TITLE: Google

As you can see, the Robots output doesn't even have a "Location" field, so it seems to be following the redirect automatically, even though the hook is documented like this:

    invoke-after-get
    This hook function is invoked immediately after the robot makes each GET request. This means your hook function will see every type of response, not just successful GETs.

My hook:

    'invoke-after-get' => sub {
        my($robot, $hook, $url, $response) = @_;
        if (DEBUG) {
            print "ORIG_URL: $url\n";
            print "URL: ";
            for($response->header_field_names) {
                print +"$_\n";
            }
            print "\n";
            print "RESPONSE: ". $response->code ."\n";
            print "\n";
        }

How do you recommend I detect a 301/302 in this case? Thanks, monks!
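One possible way to see the redirect from inside the hook, sketched here under the assumption that the `$response` handed to the hook is an ordinary `HTTP::Response` produced by LWP after it has followed the redirect: LWP keeps the intermediate responses reachable through `previous` and `redirects`. The helper name `redirect_chain` is made up for illustration, and the two responses below are built by hand to mimic the 302 from the local test page, so nothing here touches the network or WWW::Robot itself.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: returns one "CODE -> Location" string for every
# redirect response that led up to the final response.
sub redirect_chain {
    my ($response) = @_;
    return map {
        my $location = $_->header('Location');
        $_->code . ' -> ' . (defined $location ? $location : '(no Location)');
    } $response->redirects;
}

# Hand-built stand-ins for what LWP would produce after following a 302.
my $redirect = HTTP::Response->new(302, 'Moved');
$redirect->header('Location' => 'http://www.google.com');

my $final = HTTP::Response->new(200, 'OK');
$final->previous($redirect);    # link the 302 behind the final 200

print "$_\n" for redirect_chain($final);
```

If that assumption holds, calling something like `redirect_chain($response)` inside the 'invoke-after-get' hook should give a non-empty list for redirected pages and an empty list otherwise.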
In reply to WWW::Robots problem by gecko