I was working at a different site yesterday, so this is the earliest I could respond.

I didn't know about LWP::Debug, so thanks for adding to my enlightenment. If I have somehow conveyed the impression that I am generally knowledgeable about LWP, it was unintentional. My knowledge is pretty basic, and I'm trying to keep my projects pretty basic, too, for now.

Anyway, I've turned on the debug option, and changed the script to dump $a. The site is an intranet site, so you won't be able to run the code. I had the "whatever.com" in there as a too-oblique indicator of that.

Code is now:

    #!perl
    # Automated navigation through web pages
    use strict;
    use warnings;
    use WWW::Mechanize;
    use LWP::Debug '+';
    use Data::Dumper;

    my $a = WWW::Mechanize->new(
        autocheck => 1,
        agent     => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
    );
    my $testpage = 'http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O';
    $a->get($testpage);
    $a->success() or do {
        open PAGE, '>failpage.html';
        print PAGE Dumper($a);
        close PAGE;
        die 'Get failed: "'.$a->res->status_line."\" for\n".$testpage."\nor ".$a->base()."\n";
    };

Output is now:

    C:\Documents and Settings\johnsro\My Documents\Perl>perl myfinint2.pl
    LWP::UserAgent::new: ()
    LWP::UserAgent::request: ()
    HTTP::Cookies::add_cookie_header: Checking myprodportalv2.unocal.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .unocal.com for cookies
    HTTP::Cookies::add_cookie_header: Checking unocal.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .com for cookies
    LWP::UserAgent::send_request: GET http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::Protocol::http::request: ()
    LWP::UserAgent::request: Simple response: Not Implemented
    Get failed: "501 Not Implemented" for
    http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O
    or http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O

and the dumped file is

    $VAR1 = bless( {
        'req' => bless( {
            '_content' => '',
            '_uri' => bless( do{\(my $o = 'http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O')}, 'URI::http' ),
            '_headers' => bless( {
                'user-agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
            }, 'HTTP::Headers' ),
            '_method' => 'GET'
        }, 'HTTP::Request' ),
        'status' => '501',
        'content' => '',
        'ct' => 'text/plain',
        'res' => bless( {
            '_protocol' => 'HTTP/1.1',
            '_content' => '',
            '_rc' => 501,
            '_headers' => bless( {
                'connection' => 'Close',
                'client-response-num' => 1,
                'cache-control' => 'private',
                'date' => 'Wed, 19 Nov 2003 14:45:01 GMT',
                'client-peer' => '141.146.165.174:7778',
                'content-length' => '0',
                'client-date' => 'Wed, 19 Nov 2003 14:45:02 GMT',
                'content-type' => 'text/plain',
                'server' => 'Oracle9iAS/9.0.2.2.0 Oracle HTTP Server Oracle9iAS-Web-Cache/9.0.2.2.0 (N)'
            }, 'HTTP::Headers' ),
            '_msg' => 'Not Implemented',
            '_request' => $VAR1->{'req'}
        }, 'HTTP::Response' ),
        'page_stack' => [],
        'redirected_uri' => $VAR1->{'req'}{'_uri'},
        'requests_redirectable' => [ 'GET', 'HEAD', 'POST' ],
        'from' => undef,
        'timeout' => 180,
        'parse_head' => 1,
        'base' => bless( do{\(my $o = 'http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O')}, 'URI::http' ),
        'quiet' => 0,
        'protocols_forbidden' => undef,
        'no_proxy' => [],
        'protocols_allowed' => undef,
        'use_eval' => 1,
        'agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
        'cookie_jar' => bless( { 'COOKIES' => {} }, 'HTTP::Cookies' ),
        'proxy' => {},
        'max_size' => undef
    }, 'WWW::Mechanize' );

Re: Re:* Mechanize and "Not implemented"
by Corion (Patriarch) on Nov 19, 2003 at 15:18 UTC

    How did you arrive at that page? I ask because my guess is that this page was originally generated from a POST request and the server-side script doesn't know how to handle a GET request.

    To see whether the script can handle ordinary GET requests, simply paste the URL into the browser address bar or use the GET command installed with LWP.

    If all of this proves fruitless, there'll be no recourse but to log the access with your normal browser, compare it to the access with LWP and/or Mechanize, and work out the differences.

    perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
    $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
    ($c = $d->accept())->get_request(); $c->send_response( new #in the
    HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web

Re: Re:* Mechanize and "Not implemented"
by PodMaster (Abbot) on Nov 19, 2003 at 15:27 UTC
    Hmm, well, whatever the problem is, it's with your server (that's as much as you can do from Perl). Try passing all those form parameters via a POST request. Get some packet-capturing software, capture a session from a browser (which apparently works?), capture a Mechanize session, and compare the two (your finicky server is probably expecting some headers your request doesn't have).
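
    One way to try the POST suggestion is sketched below. It is only an illustration: it assumes the portal would accept the same three parameters from the URL as an ordinary form-encoded body, which nothing in this thread confirms. The variable names are made up for the example, and the request is built and printed rather than sent.

```perl
use strict;
use warnings;
use URI;
use HTTP::Request::Common qw(POST);

# The page URL from the original post.
my $url = URI->new('http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O');

# Pull the query string apart into a flat key/value list.
my @params = $url->query_form;

# Same host and path, but with the query string removed.
my $target = $url->clone;
$target->query(undef);

# Build (but do not send) a form-encoded POST carrying the parameters in the body.
my $req = POST($target, \@params);

# Show exactly what would go over the wire, for comparison with a browser capture.
print $req->as_string;

# Assumption: to send it, hand the request to the existing Mechanize object, e.g.
#   my $res = $a->request($req);
```

    Printing the request first makes it easier to line it up against a packet capture from the browser before involving the server at all.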

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

      It turns out that requests using HTTP/1.0 can generate a 501 response from the server. I've gone through several module docs (HTTP::Request, HTTP::Headers, LWP itself), but I haven't found this particular issue addressed.

      What would I need to set in the UserAgent object to have it use HTTP/1.1 protocol?

      Update: my question appears to be answered here, and it isn't the protocol that is the problem for me.

        Update: I'm of course talking about the latest and greatest (libwww-perl/5.75).

        Hmm, by default LWP::UserAgent should be sending an HTTP/1.1 request. LWP::UserAgent contains:

            if ($ENV{PERL_LWP_USE_HTTP_10}) {
                require LWP::Protocol::http10;
                LWP::Protocol::implementor('http', 'LWP::Protocol::http10');
                eval {
                    require LWP::Protocol::https10;
                    LWP::Protocol::implementor('https', 'LWP::Protocol::https10');
                };
            }

        LWP::Protocol::http10 sends HTTP/1.0, whereas LWP::Protocol::http sends HTTP/1.1. Net::HTTP is what's used under the hood, and its documentation describes this and more (see http_version, peer_http_version). Here's a little demo:

            use strict;
            use warnings;
            use LWP;
            use LWP::Debug '+';
            use Data::Dumper;

            my $ua = LWP::UserAgent->new();
            $ua->get(shift || 'http://localhost');

            # because LWP I<use>s it at runtime
            use LWP::Protocol::http;
            package LWP::Protocol::http::Socket;
            sub format_request {
                my $self = shift;
                my $ret = $self->SUPER::format_request(@_);
                LWP::Debug::trace($ret);
                return $ret;
            }
            __END__
            LWP::UserAgent::new: ()
            LWP::UserAgent::request: ()
            LWP::UserAgent::send_request: GET http://localhost
            LWP::UserAgent::_need_proxy: Not proxied
            LWP::Protocol::http::request: ()
            LWP::Protocol::http::Socket::format_request: GET / HTTP/1.1
            TE: deflate,gzip;q=0.3
            Connection: TE, close
            Host: localhost
            User-Agent: libwww-perl/5.75
            LWP::Protocol::collect: read 640 bytes
            LWP::Protocol::collect: read 854 bytes
            LWP::UserAgent::request: Simple response: OK
