brian123 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,
I have spent the last two days trying to figure out why my short Perl script does not retrieve the text between <body> and </body> of a website, with no luck. If you have time, please take a look and let me know what is wrong. I am a beginner in both Perl and HTML, so please go easy on me.

#!/usr/bin/perl -w
use LWP::UserAgent;
use HTTP::Request::Common;

# my $ua = LWP::UserAgent->new;
$ua = new LWP::UserAgent(
    keep_alive => 1,
    agent      => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.11) Gecko/20101012 Firefox/3.6.11',
);

# Create a request
my $response = $ua->post(
    'http://www.sanmateocountytaxcollector.org/SMCWPS/SearchParcels',
    {
        parcelNumber   => '048022230',
        searchType     => 'parcel',
        listFirst      => 'S',
        bkList         => 'SS',
        addressTypeDsp => '',
        addressType    => '',
        actionType     => 'search',
        nextPage       => './pages/parcelList.jsp',
        parcel1        => '048',
        parcel2        => '022',
        parcel3        => '230',
    },
);
my $u = $response->header('location')
    or die "missing location: ", $response->as_string;
print "redirecting to $u\n";

# $r = $ua->get($u);
$r = $ua->simple_request(GET $u);
print $r->as_string;

Thank you for your help.

Re: Help With Post-Redirect-Get Problem
by ikegami (Patriarch) on Dec 09, 2010 at 05:21 UTC

    You're being redirected

    HTTP/1.1 302 Found
    ...
    Location: http://www.sanmateocountytaxcollector.org/SMCWPS/Home?goTo=secure
    ...

    but you used ->simple_request instead of ->request, and ->simple_request never follows redirects, so the redirection is not being performed.

    If you switch to ->request, you end up in an infinite redirection loop.

    HTTP/1.1 302 Found
    ...
    Location: http://www.sanmateocountytaxcollector.org/SMCWPS/pages/secureSearch.jsp
    ...
    Client-Response-Num: 9
    Client-Warning: Redirect loop detected (max_redirect = 7)
    ...

    That's probably because the site is trying to set cookies and they're not being handled.

    Set-Cookie: JSESSIONID=0000WK9hTtPjeyIqUyVwqxfrxIQ:-1;Path=/

    If so, giving LWP::UserAgent a cookie jar will do the trick. So would switching to WWW::Mechanize, which keeps a cookie jar by default.
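
    A minimal sketch of the cookie-jar suggestion, assuming the redirect loop really is caused by the dropped JSESSIONID cookie (that is a guess, not something confirmed against the site). The URL and form fields are copied from the original script; the helper names new_agent and search_parcel are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent that keeps cookies in memory for the life of the script.
# (new_agent is a hypothetical helper name, not part of LWP.)
sub new_agent {
    return LWP::UserAgent->new(
        keep_alive => 1,
        agent      => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.11) Gecko/20101012 Firefox/3.6.11',
        cookie_jar => {},    # an empty hash ref asks LWP for an in-memory HTTP::Cookies jar
    );
}

# POST the search form, then GET the page the server redirects to.
sub search_parcel {
    my ($ua, $parcel) = @_;

    # Validate before touching the network: the form wants three 3-digit parts.
    my ($p1, $p2, $p3) = $parcel =~ /^(\d{3})(\d{3})(\d{3})$/
        or die "expected a 9-digit parcel number, got '$parcel'\n";

    my $response = $ua->post(
        'http://www.sanmateocountytaxcollector.org/SMCWPS/SearchParcels',
        {
            parcelNumber   => $parcel,
            searchType     => 'parcel',
            listFirst      => 'S',
            bkList         => 'SS',
            addressTypeDsp => '',
            addressType    => '',
            actionType     => 'search',
            nextPage       => './pages/parcelList.jsp',
            parcel1        => $p1,
            parcel2        => $p2,
            parcel3        => $p3,
        },
    );
    my $location = $response->header('Location')
        or die "missing location: ", $response->as_string;

    # ->get goes through ->request, so further redirects are followed and the
    # session cookie stored in the jar is sent back on each hop.
    my $page = $ua->get($location);
    die "GET $location failed: ", $page->status_line, "\n"
        unless $page->is_success;
    return $page->decoded_content;
}

# Only talk to the server when a parcel number is given on the command line.
print search_parcel(new_agent(), $ARGV[0]) if @ARGV;
```

    Run it with the parcel number as the only argument, for example 048022230. If the loop persists even with the jar in place, the cookie guess was wrong and the response headers need another look.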

      Hmm, what does all this mean? Do I just change
      $r = $ua->simple_request(GET $u); to $r = $ua->request(GET $u);
      Could you please provide the code? I am fairly new to Perl.
      Thanks.

        That's the first step. After that, you end up in an infinite redirection loop.

        HTTP/1.1 302 Found
        ...
        Location: http://www.sanmateocountytaxcollector.org/SMCWPS/pages/secureSearch.jsp
        ...
        Client-Response-Num: 9
        Client-Warning: Redirect loop detected (max_redirect = 7)
        ...

        That's probably because the site is trying to set cookies and they're not being handled.

        Set-Cookie: JSESSIONID=0000WK9hTtPjeyIqUyVwqxfrxIQ:-1;Path=/

        If so, giving LWP::UserAgent a cookie jar will do the trick. So would switching to WWW::Mechanize, which keeps a cookie jar by default.
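
        Since code was asked for, here is a rough sketch of the WWW::Mechanize route. It assumes the missing cookie is the only problem, which has not been verified against the site. WWW::Mechanize sets up a cookie jar on its own and, per its documentation, adds POST to requests_redirectable, so the redirect after the form post should be followed without a separate GET. The helper names new_mech and search_parcel are invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Build a Mechanize object; it creates its own cookie jar by default.
# (new_mech is a hypothetical helper name, not part of WWW::Mechanize.)
sub new_mech {
    return WWW::Mechanize->new(
        agent     => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.11) Gecko/20101012 Firefox/3.6.11',
        autocheck => 1,    # die with a message if any request fails
    );
}

# POST the search form and return the HTML of whatever page we end up on.
sub search_parcel {
    my ($mech, $parcel) = @_;

    # Validate before touching the network: the form wants three 3-digit parts.
    my ($p1, $p2, $p3) = $parcel =~ /^(\d{3})(\d{3})(\d{3})$/
        or die "expected a 9-digit parcel number, got '$parcel'\n";

    # ->post is inherited from LWP::UserAgent; redirects are followed and
    # cookies are stored and replayed along the way.
    $mech->post(
        'http://www.sanmateocountytaxcollector.org/SMCWPS/SearchParcels',
        {
            parcelNumber   => $parcel,
            searchType     => 'parcel',
            listFirst      => 'S',
            bkList         => 'SS',
            addressTypeDsp => '',
            addressType    => '',
            actionType     => 'search',
            nextPage       => './pages/parcelList.jsp',
            parcel1        => $p1,
            parcel2        => $p2,
            parcel3        => $p3,
        },
    );
    return $mech->content;
}

# Only talk to the server when a parcel number is given on the command line.
print search_parcel(new_mech(), $ARGV[0]) if @ARGV;
```

        Pass the parcel number as the only argument, for example 048022230. With autocheck on, a failed request or a redirect loop ends the script with an error message instead of silently printing a 302 response.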