jonjacobmoon has asked for the wisdom of the Perl Monks concerning the following question:

I apologize for posting so much lately with what seem (to me at least) to be basic questions, but the docs for HTTP::Request::Common and HTTP::Cookies are not exactly clear to me.

Nonetheless, I seem to be doing what I want to do according to how others have done it, yet something is amiss. I have set up a cookie_jar and it prints out when I print it as a string, but when I do the request, it does not make it across and I get a 404. Why is the cookie not being sent as described in about 100 documents all over the web?

#!/usr/local/bin/perl -w
use strict;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST GET);;
use HTTP::Headers;
use HTML::Parser;
use URI;

my $cururi;
my $url;
my @urls;
my $sessionid;
my $browser = LWP::UserAgent->new;
my $email = 'xxxxxxxx';
my $password = 'xxxxxxx';
my $req;
my $res;

$browser->cookie_jar(HTTP::Cookies->new(file => 'cookie_jar', autosave => 1));
my $initurl = "http://www.amazon.com/exec/obidos/flex-sign-in/ref=pd_nfy_gw_si/";
&browserEmulation($initurl);

open(CJ,"cookie_jar");
while (<CJ>) {
    chomp;
    if ($_ =~ /session-id\=\"(\d*-\d*-\d*)\"/) {
        $sessionid = $1;
    }
}
close(CJ);

$url = "http://www.amazon.com/exec/obidos/flex-sign-in-done/$sessionid";
$res = $browser->simple_request(POST "$url",
    {
        'email'     => $email,
        'action'    => 'sign-in checked',
        'next-page' => 'recs/instant-recs-sign-in-standard.html',
        'password'  => $password,
        'method'    => 'get',
        'opt'       => 'oa',
        'page'      => 'recs/instant-recs-register-standard.html',
        'response'  => 'tg/stores/static/-/goldbox/index/',
    });

while ($res->is_redirect) {
    my $u = $res->header('location') or die "missing location: ", $res->as_string;
    print "redirecting to $u\n";
    $res = $browser->simple_request(GET $u);
}

if ($res->is_success) {
    print $res->as_string;
} else {
    my $cururl = $res->base->as_string;
    print "Error: " . $res->status_line . " $cururl\n";
    print $res->as_string;
}

sub browserEmulation {
    my $starturl = shift || die "No url supplied\n";
    my $baseuri = URI->new($starturl);
    push @urls,$starturl;
    my $parser = HTML::Parser->new(api_version => 3,
                                   start_h => [\&start ,"tagname, attr"]);
    my $page;
    $browser->agent("Jonzilla/666");
    while( $url = shift @urls) {
        my $request = new HTTP::Request 'GET' => $url;
        my $result = $browser->request($request);
        # shortens Tim's correction, but he gets credit for finding the bug
        $cururi = $result->base->as_string;
        if ($result->is_success) {
            $page .= $result->as_string;
            $parser->parse($result->content);
        } else {
            $page .= "Error: " . $result->status_line . " URL=$url, $baseuri\n";
            if ($result->as_string ne "") {
                $page .= $result->as_string;
            }
        }
        return $page;
    }

    sub start {
        my($tag,$attr) = @_;
        if ($tag eq 'frame' ) {
            my $thisuri = URI->new($attr->{src});
            push @urls, $thisuri->abs($cururi);
        }
    }
}

sub scan_cj {
    if ($_[1] eq 'session-id') {
        $sessionid = $_[2];
    }
}


I admit it, I am Paco.

Edited by footpad, ~ Sun Sep 29 03:15:30 2002 (UTC) : Added <readmore> tag, per Consideration

Replies are listed 'Best First'.
Re: Cookie Not Being Set
by blm (Hermit) on Sep 28, 2002 at 18:34 UTC

    Maybe I am missing something but the $sessionid part of the URL doesn't seem to come from a cookie. It seems to be autogenerated on arrival at www.amazon.com. The 404 might be generated by using an unacceptable sessionid. I looked at my amazon.com cookie and the value was different to the sessionid in my urls.

    From where I stand, the best approach appears to be to go in through the front door at http://www.amazon.com/, get redirected to http://www.amazon.com/exec/obidos/subst/home/home.html/xxx-xxxxxxx-xxxxxxx (where the x's are a valid sessionid) and use HTML::Parser and LWP::UserAgent to follow the links through to where you want to go. I know it is a bit of work but I don't see a way around it.
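A minimal sketch of the first step of that approach: pulling the session id off the end of the URL you were redirected to. The helper name `session_id_from_url`, the sample id, and the 3-7-7 digit pattern are all assumptions for illustration (the pattern is read off the xxx-xxxxxxx-xxxxxxx placeholder above, not from any Amazon documentation).

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: return the trailing session id from a URL,
# or undef if the path does not end in one.
sub session_id_from_url {
    my ($url) = @_;
    my $path = URI->new($url)->path;
    # Assumed format: 3 digits, 7 digits, 7 digits, separated by dashes.
    return $path =~ m{/(\d{3}-\d{7}-\d{7})/?\z} ? $1 : undef;
}

# Made-up Location value, shaped like the redirect target described above.
my $location = 'http://www.amazon.com/exec/obidos/subst/home/home.html/123-4567890-1234567';
my $sid = session_id_from_url($location);

print defined $sid ? "session id: $sid\n" : "no session id found\n";
```

In a real run the URL would come from `$res->header('location')` on the redirect response rather than from a literal string.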

    --blm-- If you don't like this post can you please /msg me
      Been done and tried.

      It may not come from a cookie, but it is in the cookie and must match the cookie. I get a session-id and append it to the URL so I can have a valid Amazon URL. That is not the problem. The problem, as I see it, is that the cookie is not in the result object even though it appears to be in the request object. Where did it go?
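For reference, the session-id can also be read straight from the in-memory jar with `scan`, instead of re-opening the cookie_jar file and matching it with a regex. This is only a sketch: the cookie is set by hand with a made-up value and a one-hour lifetime so the example runs without touching the network.

```perl
use strict;
use warnings;
use HTTP::Cookies;

my $jar = HTTP::Cookies->new;

# Made-up cookie standing in for whatever the server actually set.
# Arguments: version, key, value, path, domain, port, path_spec,
# secure, maxage, discard.
$jar->set_cookie(0, 'session-id', '123-4567890-1234567',
                 '/', '.amazon.com', undef, 1, 0, 3600, 0);

my $sessionid;

# scan() calls the callback once per cookie. The key is the second
# argument and the value the third, as in the scan_cj sub above.
$jar->scan(sub {
    my (undef, $key, $val) = @_;
    $sessionid = $val if $key eq 'session-id';
});

print "session-id from jar: $sessionid\n";
```

With the script above, the equivalent call would be `$browser->cookie_jar->scan(\&scan_cj)`.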


      I admit it, I am Paco.
        the cookie is not in the result object

        Why would you expect it to be? The server isn't required to send the cookie (or even a Set-Cookie header) with each response. (This is a nit, but "response" is a better term to use than result.)

        Where did it go?

        You saved it in your cookie jar. That's where it is. Your browser will send the cookie along with subsequent requests so long as it hasn't expired. That's why you see it in the request object.
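A small offline sketch of that round trip, using the jar directly rather than a live server. The host www.example.com and the cookie value are invented for the example; `extract_cookies` and `add_cookie_header` are the HTTP::Cookies methods that a user agent with a cookie jar relies on for responses and requests.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

my $jar = HTTP::Cookies->new;

# Fake a first response that sets a cookie (no network involved).
my $first_req = HTTP::Request->new(GET => 'http://www.example.com/');
my $first_res = HTTP::Response->new(200, 'OK');
$first_res->request($first_req);
$first_res->header('Set-Cookie' => 'session-id=123-4567890-1234567; path=/');

# The jar picks the cookie out of the response...
$jar->extract_cookies($first_res);

# ...and attaches it to a later request for the same host.
my $second_req = HTTP::Request->new(GET => 'http://www.example.com/next');
$jar->add_cookie_header($second_req);

print $second_req->as_string;
```

Note that a second response carrying no Set-Cookie header would leave the jar untouched, so the cookie would still go out on the request after that.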

        -sauoq
        "My two cents aren't worth a dime.";
        
        The problem, as I see it, is that the cookie is not in the result object even though it appears to be in the request object.

        You say that the cookie "appears to be" in the request object. Have you verified that it is? If so, how?
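One way to check, sketched here without hitting the network: let the user agent prepare a request and print the headers it would send. This assumes your LWP::UserAgent provides `prepare_request`; the host and the hand-set cookie are made up for the example.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Request::Common qw(GET);

my $browser = LWP::UserAgent->new;
$browser->cookie_jar(HTTP::Cookies->new);

# Made-up cookie so there is something in the jar to send.
# Arguments: version, key, value, path, domain, port, path_spec,
# secure, maxage, discard.
$browser->cookie_jar->set_cookie(0, 'session-id', '123-4567890-1234567',
                                 '/', 'www.example.com', undef, 1, 0, 3600, 0);

my $req = GET 'http://www.example.com/some/page';

# prepare_request applies the user agent's settings (including the
# cookie jar) to the request object without sending it.
$browser->prepare_request($req);

print $req->as_string;
```

After a real request, printing `$res->request->as_string` should show the request that produced that response, which is another place to look for the Cookie header.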