vitoco has asked for the wisdom of the Perl Monks concerning the following question:

I tried to send this by mail to petdance, but a spam filter refuses to accept my email.

Well, I think I found a bug.

    $mech->get($url);
    for my $img ($mech->images) {
        # $mech->head($img->url_abs());
        # $mech->back();
        $mech->get($img->url_abs());
        $mech->back();
    }

What I want to do is request every image with the page's URL as the referrer. That code works as expected, except when I uncomment the two lines that request the headers first. Then the URL used by head() is sent by get() as the referrer, even though back() is called in between; this already goes wrong with the first embedded image. The same URL is also used as the referrer by head() for the second image.

I've set up a working example for $url at: <http://www.vitoco.cl/test-ref>

My complete script is at: <http://www.vitoco.cl/test-ref/test-ref.pl>

First, I thought it was an issue about multiple redirections of the images, but discarded this idea using this other test page with direct links: <http://www.vitoco.cl/test-ref/index2.html>

I used a local proxy to capture HTTP headers.

If I'm doing something wrong, please let me know.

Thanks... ++Vitoco

Re: Referer and HEAD using WWW::Mechanize
by ikegami (Patriarch) on Aug 14, 2009 at 18:23 UTC

    I ran your code and there was no indication in the output that any errors occurred.

    200 200 Back 1 OK 200 Back 2 OK 200 Back 1 OK 200 Back 2 OK

    I improved your test so we can actually see the problem:

    #!perl

    use strict;
    use warnings;

    use WWW::Mechanize qw( );

    my $mech = WWW::Mechanize->new();
    #$mech->proxy(['http', 'ftp'], 'http://localhost:8080/');

    my $url = 'http://www.vitoco.cl/test-ref';
    #my $url = 'http://www.vitoco.cl/test-ref/index2.html';

    $mech->get($url)->is_success()
        or die;

    @ARGV or die;

    for my $img ($mech->images) {
        if ($ARGV[0]) {
            $mech->head($img->url_abs())->is_success() or die;
            $mech->back() or die;
        }

        my $response = $mech->get($img->url_abs());
        print($mech->status(), "\n");
        print( ( $response->redirects() )[0]->request()->referer(), "\n");
        $mech->back() or die;
        print("\n");
    }

    $ perl a.pl 0
    200
    http://www.vitoco.cl/test-ref

    200
    http://www.vitoco.cl/test-ref

    $ perl a.pl 1
    200
    http://www.vitoco.cl/test-ref/img1

    200
    http://www.vitoco.cl/test-ref/img1

    When I upgraded WWW::Mechanize, back() returned false after a head().

    $ perl a.pl 0
    200
    http://www.vitoco.cl/test-ref

    200
    http://www.vitoco.cl/test-ref

    $ perl a.pl 1
    Died at a.pl line 22.

    So yes, WWW::Mechanize is buggy.

    if ( $request->method eq 'GET' || $request->method eq 'POST' ) {
        $self->_push_page_stack();
    }

    should be changed to

    $self->_push_page_stack();

    Please file a bug report.

      Please file a bug report.

      Done!
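
To see why pushing the page stack only for GET and POST can break back() after a head(), here is a toy model of a browser history. It is only a sketch: the ToyBrowser package, its fields, and the push_always flag are hypothetical names made up for illustration and are not WWW::Mechanize's real internals. It assumes that every request replaces the current page, while back() can only return to pages that were pushed.

```perl
#!perl
use strict;
use warnings;

# Toy model only: ToyBrowser and its fields are hypothetical and are
# not WWW::Mechanize's real internals.
package ToyBrowser;

sub new {
    my ($class, %args) = @_;
    return bless {
        push_always => $args{push_always},  # true = push for every method
        current     => undef,               # URL of the current page
        stack       => [],                  # pages that back() can return to
    }, $class;
}

# Simulate a request and return the Referer that would be sent with it.
sub request {
    my ($self, $method, $url) = @_;
    my $referer = $self->{current};
    if ( $self->{push_always} || $method eq 'GET' || $method eq 'POST' ) {
        push @{ $self->{stack} }, $self->{current}
            if defined $self->{current};
    }
    $self->{current} = $url;    # every method replaces the current page
    return $referer;
}

# Return to the previous page; false if there is nothing to go back to.
sub back {
    my ($self) = @_;
    return 0 unless @{ $self->{stack} };
    $self->{current} = pop @{ $self->{stack} };
    return 1;
}

package main;

# Replay get(page), head(img1), back(), get(img1) and report
# whether back() succeeded and which Referer the final get() would send.
sub replay {
    my ($push_always) = @_;
    my $browser = ToyBrowser->new( push_always => $push_always );
    $browser->request( GET  => 'page' );
    $browser->request( HEAD => 'img1' );
    my $back_ok = $browser->back();
    my $referer = $browser->request( GET => 'img1' );
    return ( $back_ok, $referer );
}

for my $push_always ( 0, 1 ) {
    my ( $back_ok, $referer ) = replay($push_always);
    printf "push_always=%d back=%d referer=%s\n",
        $push_always, $back_ok, $referer;
}
```

In this model, the conditional push leaves nothing for back() to pop after the HEAD, so back() returns false and the following GET sends img1 as its referrer. With the unconditional push, back() succeeds and the GET sends the page URL instead.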