Baz has asked for the wisdom of the Perl Monks concerning the following question:

I'm attempting to fetch data from a web page using get. The web page in question allows people only 10 views of a page per day, but when I attempt to fetch the page using get, I receive a message informing me that all my views are used up for today, despite the fact that this was my first attempt at viewing the page. I'm guessing the problem lies with cookies, but this problem is out of my league. Can anyone suggest how I might approach solving it? I'm guessing I could work out the HTTP transfer between IE6 (my browser, which does allow me to view the page 10 times) and the server, and then replicate this transfer using Perl. Is it possible to view the HTTP transfer of a browser?
Thanks, Barry.

Replies are listed 'Best First'.
(jeffa) Re: Fetching web pages using
by jeffa (Bishop) on Jul 31, 2002 at 22:07 UTC
    If the problem is indeed cookies, then you should use a module like HTTP::Cookies to store a 'cookie jar' for you. You might need to log in as well. Here is a script that I use to post comments to the Chatterbox from a terminal. Meditate upon it and see if you can write a similar script to solve your problem. You will probably not want to use Netscape cookies either - YMWV. If you do use Netscape cookies, you will need to launch Netscape, log in to the site, and exit Netscape to save the proper cookie. Otherwise, read the HTML code for the login form and supply the proper form values in the call to POST().

    use strict;
    use LWP;
    use HTTP::Request::Common;
    use HTTP::Cookies;

    print "\n: ";
    chomp ($_ = <>);

    use constant URL => 'http://www.perlmonks.org/';

    my $ua = LWP::UserAgent->new;
    $ua->agent('chat_poster/1.0 (' . $ua->agent .')');
    $ua->cookie_jar(HTTP::Cookies::Netscape->new(
        file     => $ENV{HOME} . '/.netscape/cookies',
        autosave => 1
    ));

    my $request = POST(URL,
        Content => [
            op           => 'message',
            message_send => 'talk',
            message      => $_,
        ]
    );

    # this is just to gain back control of the terminal
    # you will probably want to not fork for your problem
    exit if fork();

    my $response = $ua->request($request);
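
    For the route that does not rely on a browser's cookie file, a minimal sketch might look like the following. It keeps cookies in an in-memory HTTP::Cookies jar, posts the login form, and then fetches the page with the same user agent. The URLs passed in and the form field names 'user' and 'passwd' are placeholders, not taken from any real site; read the actual names from the login form's HTML.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST GET);
use HTTP::Cookies;

# Build the login POST. The field names 'user' and 'passwd' are
# hypothetical; use the names from the site's login form.
sub build_login_request {
    my ($login_url, $user, $pass) = @_;
    return POST($login_url, [ user => $user, passwd => $pass ]);
}

# Log in, then fetch $page_url with the cookies the login set.
sub fetch_after_login {
    my ($login_url, $page_url, $user, $pass) = @_;

    my $ua = LWP::UserAgent->new;
    # No file given: cookies live in memory for this run only.
    $ua->cookie_jar(HTTP::Cookies->new);

    # Many login forms answer with a redirect, so accept that too.
    my $login = $ua->request(build_login_request($login_url, $user, $pass));
    die 'Login failed: ' . $login->status_line . "\n"
        unless $login->is_success || $login->is_redirect;

    my $page = $ua->request(GET($page_url));
    die 'Fetch failed: ' . $page->status_line . "\n" unless $page->is_success;
    return $page->content;
}

# Only touches the network when all four arguments are supplied:
# login URL, page URL, user name, password.
print fetch_after_login(@ARGV) if @ARGV == 4;
```

    Because the same $ua object makes both requests, whatever cookie the login sets is sent back automatically on the second request.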

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: Fetching web pages using
by Cine (Friar) on Jul 31, 2002 at 22:08 UTC
    Something like:

    use LWP::UserAgent;
    use HTTP::Cookies;

    my $ua = LWP::UserAgent->new;
    $ua->agent('whatever IE6 say');
    $ua->cookie_jar(HTTP::Cookies->new(file => "lwpcookies.txt", autosave => 1));

    my $req = HTTP::Request->new(GET => 'http://...');
    my $res = $ua->request($req);
    if ($res->is_success) {
        print $res->content;
    } else {
        print "Bad luck this time\n";
    }
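
    To check what such a script would send before pointing it at the real site, one option is LWP::UserAgent's prepare_request method, which fills in the User-Agent header and any Cookie header from the jar without contacting the server. A minimal sketch; the IE6 agent string, the www.example.com URL, and the hand-made cookie are illustrative assumptions:

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Request;

# Return the request as the user agent would send it, with default
# headers and cookies applied, but without any network traffic.
sub prepared_request {
    my ($ua, $url) = @_;
    return $ua->prepare_request(HTTP::Request->new(GET => $url));
}

my $ua = LWP::UserAgent->new;
# A typical IE6-on-Windows-XP string (assumption); copy the real one
# from your own browser.
$ua->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');

my $jar = HTTP::Cookies->new;    # in-memory jar
# Seed the jar by hand with a made-up cookie to show the effect.
# Arguments: version, key, value, path, domain, port, path_spec,
# secure, maxage, discard.
$jar->set_cookie(0, 'page_views_so_far', '1', '/', 'www.example.com',
                 undef, 1, 0, 3600, 0);
$ua->cookie_jar($jar);

my $req = prepared_request($ua, 'http://www.example.com/');
print $req->as_string;
```

    Comparing this output with the headers the browser sends should show which header the script is missing.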


    T I M T O W T D I
Re: Fetching web pages using
by BorgCopyeditor (Friar) on Aug 01, 2002 at 04:21 UTC

    Not a Perl solution, but you could look at the cookie (if that's how the page limit is being enforced) and see if there's an obvious way to spoof it (if, for example, the cookie just reads "page_views_so_far=5"). Your browser is just sending something like:

    GET / HTTP/1.0
    Cookie: page_views_so_far=5

    and then two newlines. That's all the cookie transaction is. Note: there are probably other headers, too, especially if it's HTTP 1.1 ... you can view all the headers your browser is sending here (all except whatever cookie(s), of course).
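
    The request above can also be assembled by hand in Perl, which makes it plain that a cookie is just one more header line. A minimal sketch using only core modules; the Host header is an addition (many servers need it to pick the right site), and the host name and cookie name are placeholders:

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Assemble a raw HTTP/1.0 GET request carrying the given cookies.
# Header lines end in CRLF, and an empty line ends the header block.
sub build_request {
    my ($host, $path, @cookies) = @_;    # @cookies: name => value pairs
    my @pairs;
    while (my ($name, $value) = splice(@cookies, 0, 2)) {
        push @pairs, "$name=$value";
    }
    my @lines = ("GET $path HTTP/1.0", "Host: $host");
    push @lines, 'Cookie: ' . join('; ', @pairs) if @pairs;
    return join("\r\n", @lines) . "\r\n\r\n";
}

# Send the request over a plain TCP socket and return the raw reply.
sub send_request {
    my ($host, $request) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,
        Proto    => 'tcp',
    ) or die "Cannot connect to $host: $!\n";
    print {$sock} $request;
    local $/;                 # slurp the whole reply
    my $reply = <$sock>;
    close $sock;
    return $reply;
}

# Only touches the network when a host name is given on the command line.
if (@ARGV) {
    my $host = shift @ARGV;
    print send_request($host, build_request($host, '/', page_views_so_far => 5));
}
```

    Whether a hand-edited cookie value is accepted depends entirely on how the server checks it.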

    BCE
    --Your punctuation skills are insufficient!