
Re: Trouble with LWP::UserAgent with certain website

by poolpi (Hermit)
on Jul 29, 2008 at 05:49 UTC


in reply to Trouble with LWP::UserAgent with certain website

This website sends you a cookie.
You can use something like this:

use LWP::UserAgent;
use HTTP::Cookies;

my $ua = LWP::UserAgent->new(
    agent      => 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322)',
    cookie_jar => HTTP::Cookies->new(
        file           => 'cookies.txt',
        autosave       => 1,
        ignore_discard => 1,    # also keep session cookies that carry no expiry date
    ),
);
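
To show what the cookie jar does behind the scenes, here is a minimal offline sketch. It assumes a made-up host (www.example.com) and a made-up session ID (abc123), and it builds the server response by hand instead of fetching it, so nothing here comes from the real site. LWP::UserAgent makes the same two jar calls itself when a cookie_jar is attached.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use HTTP::Request;
use HTTP::Response;

# In-memory jar (no file), enough to watch the round trip.
my $jar = HTTP::Cookies->new( ignore_discard => 1 );

# Hypothetical first exchange: the server answers with a session cookie.
my $first_request = HTTP::Request->new( GET => 'http://www.example.com/' );
my $response      = HTTP::Response->new( 200, 'OK' );
$response->header( 'Set-Cookie' => 'ASP.NET_SessionId=abc123; path=/' );
$response->request($first_request);    # the jar needs the URL that set the cookie

# Store whatever cookies the response carries.
$jar->extract_cookies($response);

# On the next request to the same host, the jar adds the Cookie header.
my $second_request = HTTP::Request->new( GET => 'http://www.example.com/Section.aspx' );
$jar->add_cookie_header($second_request);

my $cookie_header = $second_request->header('Cookie');
print "Cookie: $cookie_header\n";
```

With a jar in place there is no need to hardcode the session ID: the value the server hands out on the first response is replayed on every later request to that host.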

UPDATE
I tested the script and your problem comes from the ASP.NET_SessionId (see below)

#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use LWP::Debug qw(+);    # trace every step of the request

my $ua  = LWP::UserAgent->new();
my $url = q{http://www.investway.com};

# Browser-like request headers, plus the ASP.NET session cookie
my @headers = (
    'User-Agent'      => 'User-Agent=Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.14) Gecko/20080404 Iceweasel/2.0.0.14 (Debian-2.0.0.14-2)',
    'Accept-Language' => 'Accept-Language=en-us,en;q=0.5',
    'Accept-Charset'  => 'Accept-Charset=ISO-8859-1,utf-8;q=0.7,*;q=0.7',
    'Accept-Encoding' => 'Accept-Encoding=gzip,deflate',
    'Accept'          => "image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*",
    Cookie            => 'ASP.NET_SessionId=l0gfoxzlneisjjfimjng23v1',
);

my $res = $ua->get( $url, @headers );

if ( $res->is_success ) {
    print $res->headers_as_string, "\n";
}
else {
    print $res->status_line . "\n";
}

Output:

LWP::UserAgent::new: ()
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://www.investway.com
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 552 bytes
LWP::Protocol::collect: read 570 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1368 bytes
LWP::Protocol::collect: read 1028 bytes
LWP::UserAgent::request: Simple response: OK
Cache-Control: private
Date: Tue, 29 Jul 2008 09:34:01 GMT
Server: Microsoft-IIS/6.0
Content-Length: 19934
Content-Type: text/html; charset=utf-8
Client-Date: Tue, 29 Jul 2008 09:33:51 GMT
Client-Peer: 10.154.68.6:8080
Client-Response-Num: 1
Link: <Includes/PropertyDetail.css>; /="/"; rel="stylesheet"; type="text/css"
Link: <Includes/thickbox.css>; /="/"; media="screen"; rel="stylesheet"; type="text/css"
Link: <Images/favicon.ico>; /="/"; rel="shortcut icon"
Title: Investway
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET

hth,
PooLpi

'Ebry haffa hoe hab im tik a bush'. Jamaican proverb

Replies are listed 'Best First'.
Re^2: Trouble with LWP::UserAgent with certain website
by cketcham (Initiate) on Jul 29, 2008 at 16:10 UTC
    I tried your new script and got this result:

    LWP::UserAgent::new: ()
    LWP::UserAgent::request: ()
    LWP::UserAgent::send_request: GET http://www.investway.com
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::Protocol::http::request: ()
    LWP::Protocol::collect: read 160 bytes
    LWP::UserAgent::request: Simple response: Found
    LWP::UserAgent::request: ()
    LWP::UserAgent::send_request: GET http://www.investway.com/ErrorPage.aspx?aspxerrorpath=/Section.aspx
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::Protocol::http::request: ()
    LWP::Protocol::collect: read 784 bytes
    LWP::Protocol::collect: read 2242 bytes
    LWP::UserAgent::request: Simple response: Internal Server Error
    500 Internal Server Error


    I am on a Windows XP platform. Also, I can block cookies in Internet Explorer and the web page still loads correctly in the browser window, so I am wondering why cookies would be significant. Also, could you please explain a little how you figured out cookies were being used by the web page?
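
The trace above shows two hops: a 302 Found, then a 500 from ErrorPage.aspx. As a rough sketch, the same chain can be read from the final response object instead of the debug log. The helper name describe_chain is made up for illustration, and the two responses are built by hand from the URLs in the trace so the example runs without touching the network.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Describe each hop of a (possibly redirected) response, oldest first.
sub describe_chain {
    my ($response) = @_;
    my @hops = ( $response->redirects, $response );
    return map {
        my $uri = $_->request ? $_->request->uri : '(unknown URL)';
        sprintf '%s -> %s', $uri, $_->status_line;
    } @hops;
}

# Hand-built stand-ins for the two responses seen in the trace.
my $first = HTTP::Response->new( 302, 'Found' );
$first->request( HTTP::Request->new( GET => 'http://www.investway.com' ) );

my $final = HTTP::Response->new( 500, 'Internal Server Error' );
$final->request(
    HTTP::Request->new(
        GET => 'http://www.investway.com/ErrorPage.aspx?aspxerrorpath=/Section.aspx'
    )
);
$final->previous($first);    # LWP links redirected responses this way

my @chain = describe_chain($final);
print "$_\n" for @chain;
```

In a real script, the object returned by $ua->get($url) could be passed to the helper directly, since LWP fills in the previous-response links while it follows redirects.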

      Have you read my update?

      More information:
      - Check your HTTP headers with a Firefox add-on, Tamper for example.
      - You may also want to read this article about ASP.NET sessions.
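
If no browser add-on is at hand, the same check can be sketched in Perl: look at the Set-Cookie headers on the response. The helper name cookie_names and the sample header values below are invented for illustration; the response is built by hand so the snippet runs offline.

```perl
use strict;
use warnings;
use HTTP::Response;

# Return the names of all cookies a response tries to set.
sub cookie_names {
    my ($response) = @_;
    my @names;
    for my $set_cookie ( $response->header('Set-Cookie') ) {
        # The cookie name is everything before the first '='.
        my ($name) = $set_cookie =~ /^\s*([^=;\s]+)\s*=/;
        push @names, $name if defined $name;
    }
    return @names;
}

# Hand-built stand-in for a server response (values are made up).
my $response = HTTP::Response->new( 302, 'Found' );
$response->push_header( 'Set-Cookie' => 'ASP.NET_SessionId=abc123; path=/; HttpOnly' );
$response->push_header( 'Location'   => '/ErrorPage.aspx' );

my @names = cookie_names($response);
print "$_\n" for @names;
```

Against a live site, one option is to create the agent with LWP::UserAgent->new( max_redirect => 0 ) so the first response is returned as-is, then pass the result of $ua->get($url) to the helper.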

      Good luck ;)

      hth,
      PooLpi

      'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
