Roy Johnson has asked for the wisdom of the Perl Monks concerning the following question:

I've got a URL that loads fine in a browser, but get() using WWW::Mechanize fails with "501 Not Implemented". What does that suggest?

Code follows.

#!perl
# Automated navigation through web pages
use strict;
use warnings;
use WWW::Mechanize;

my $a = WWW::Mechanize->new(
    autocheck => 1,
    agent     => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
);
my $testpage = 'http://myprodportalv2.whatever.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O';
$a->get($testpage);
$a->success() or do {
    ## This dumps an empty file
    open PAGE, '>failpage.html';
    print PAGE $a->content;
    close PAGE;
    die 'Get failed: "'.$a->res->status_line."\" for\n".$testpage."\nor ".$a->base()."\n";
};

Replies are listed 'Best First'.
Re: Mechanize and "Not implemented"
by PodMaster (Abbot) on Nov 18, 2003 at 00:16 UTC
    What does that suggest?
    A problem ;)
    
    10.5.2 501 Not Implemented
    
       The server does not support the functionality required to fulfill the
       request. This is the appropriate response when the server does not
       recognize the request method and is not capable of supporting it for
       any resource.
    
    Don't forget that WWW::Mechanize is a proper subclass of LWP::UserAgent, so you can add use LWP::Debug qw(+); to trace the request. Trying to run the code, I get
    Error GETing http://myprodportalv2.whatever.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O: Can't connect to myprodportalv2.whatever.com:7778 (Bad hostname 'myprodportalv2.whatever.com') at www.mechanize.501.pl line 16
    so how about providing a dump of $a (as in Data::Dumper)?

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

      I was working at a different site yesterday, so this is the earliest I could respond.

      I didn't know about LWP::Debug, so thanks for adding to my enlightenment. If I have somehow conveyed the impression that I am generally knowledgeable about LWP, it was unintentional. My knowledge is pretty basic, and I'm trying to keep my projects pretty basic, too, for now.

      Anyway, I've turned on the debug option, and changed the script to dump $a. The site is an intranet site, so you won't be able to run the code. I had the "whatever.com" in there as a too-oblique indicator of that.

      Code is now:

      #!perl
      # Automated navigation through web pages
      use strict;
      use warnings;
      use WWW::Mechanize;
      use LWP::Debug '+';
      use Data::Dumper;

      my $a = WWW::Mechanize->new(
          autocheck => 1,
          agent     => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
      );
      my $testpage = 'http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O';
      $a->get($testpage);
      $a->success() or do {
          open PAGE, '>failpage.html';
          print PAGE Dumper($a);
          close PAGE;
          die 'Get failed: "'.$a->res->status_line."\" for\n".$testpage."\nor ".$a->base()."\n";
      };
      Output is now:
      C:\Documents and Settings\johnsro\My Documents\Perl>perl myfinint2.pl
      LWP::UserAgent::new: ()
      LWP::UserAgent::request: ()
      HTTP::Cookies::add_cookie_header: Checking myprodportalv2.unocal.com for cookies
      HTTP::Cookies::add_cookie_header: Checking .unocal.com for cookies
      HTTP::Cookies::add_cookie_header: Checking unocal.com for cookies
      HTTP::Cookies::add_cookie_header: Checking .com for cookies
      LWP::UserAgent::send_request: GET http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O
      LWP::UserAgent::_need_proxy: Not proxied
      LWP::Protocol::http::request: ()
      LWP::UserAgent::request: Simple response: Not Implemented
      Get failed: "501 Not Implemented" for
      http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O
      or http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O
      and the dumped file is
      $VAR1 = bless( {
        'req' => bless( {
          '_content' => '',
          '_uri' => bless( do{\(my $o = 'http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O')}, 'URI::http' ),
          '_headers' => bless( {
            'user-agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'
          }, 'HTTP::Headers' ),
          '_method' => 'GET'
        }, 'HTTP::Request' ),
        'status' => '501',
        'content' => '',
        'ct' => 'text/plain',
        'res' => bless( {
          '_protocol' => 'HTTP/1.1',
          '_content' => '',
          '_rc' => 501,
          '_headers' => bless( {
            'connection' => 'Close',
            'client-response-num' => 1,
            'cache-control' => 'private',
            'date' => 'Wed, 19 Nov 2003 14:45:01 GMT',
            'client-peer' => '141.146.165.174:7778',
            'content-length' => '0',
            'client-date' => 'Wed, 19 Nov 2003 14:45:02 GMT',
            'content-type' => 'text/plain',
            'server' => 'Oracle9iAS/9.0.2.2.0 Oracle HTTP Server Oracle9iAS-Web-Cache/9.0.2.2.0 (N)'
          }, 'HTTP::Headers' ),
          '_msg' => 'Not Implemented',
          '_request' => $VAR1->{'req'}
        }, 'HTTP::Response' ),
        'page_stack' => [],
        'redirected_uri' => $VAR1->{'req'}{'_uri'},
        'requests_redirectable' => [ 'GET', 'HEAD', 'POST' ],
        'from' => undef,
        'timeout' => 180,
        'parse_head' => 1,
        'base' => bless( do{\(my $o = 'http://myprodportalv2.unocal.com:7778/portal/page?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O')}, 'URI::http' ),
        'quiet' => 0,
        'protocols_forbidden' => undef,
        'no_proxy' => [],
        'protocols_allowed' => undef,
        'use_eval' => 1,
        'agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
        'cookie_jar' => bless( { 'COOKIES' => {} }, 'HTTP::Cookies' ),
        'proxy' => {},
        'max_size' => undef
      }, 'WWW::Mechanize' );

        How did you arrive at that page? I ask because my guess is that the page was originally generated from a POST request and the server-side script doesn't know how to handle a GET request.

        To see whether the script can handle ordinary GET requests, simply paste the URL into the browser address bar or use the GET command installed with LWP.

        If all of this proves fruitless, there'll be no recourse but to log the access with your normal browser, compare it to the access with LWP and/or Mechanize, and work out the differences.
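For the comparison suggested above, it can help to see exactly what the LWP side puts on the wire. The following is a minimal sketch, not a fix: it builds the GET request by hand and prints it without sending anything, so the output can be laid next to a browser capture. The host name is a hypothetical stand-in for the intranet server, and the Accept and Accept-Language values are assumptions about what a browser might send, not values taken from the thread.

```perl
use strict;
use warnings;
use HTTP::Request;

# Hypothetical stand-in for the intranet URL discussed in the thread.
my $url = 'http://portal.example.com:7778/portal/page'
        . '?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O';

# Build the request without sending it.
my $req = HTTP::Request->new( GET => $url );
$req->header( 'User-Agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)' );

# Assumed examples of headers a browser sends that a bare LWP request lacks.
$req->header( 'Accept'          => 'text/html,*/*' );
$req->header( 'Accept-Language' => 'en-us' );

# Request line plus headers, ready to compare with the browser's request.
my $dump = $req->as_string;
print $dump;
```

Once the headers look like the browser's, the same object can be passed to the request() method that WWW::Mechanize inherits from LWP::UserAgent to see whether the 501 goes away.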

        perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
        $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
        ($c = $d->accept())->get_request(); $c->send_response( new #in the
        HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
        Hmm, well, whatever the problem is, it's with your server (that's as much as you can do from Perl). Try passing all those form parameters via a POST request. Get some packet-capturing software, capture a session from a browser (which apparently works?), capture a Mechanize session, and compare the two (your finicky server is probably expecting some headers your request doesn't have).
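One way to try the POST idea is to move the query-string parameters into a form-encoded body. This is only a sketch under assumptions: the host name is a hypothetical stand-in, and whether the portal accepts these parameters as POST data at all is unknown. The request is built and printed, not sent.

```perl
use strict;
use warnings;
use URI;
use HTTP::Request::Common qw(POST);

# Hypothetical stand-in for the intranet URL discussed in the thread.
my $url = 'http://portal.example.com:7778/portal/page'
        . '?_pageid=33,30917,33_30962&_dad=puno2o&_schema=PORTAL_PUNO2O';

my $uri  = URI->new($url);
my @form = $uri->query_form;    # key/value pairs, in their original order
$uri->query(undef);             # drop the query string from the target URL

# Same parameters, now sent as an application/x-www-form-urlencoded body.
my $post = POST( "$uri", \@form );
print $post->as_string;
```

To actually send it, the object can be handed to the request() method of a WWW::Mechanize (or plain LWP::UserAgent) instance, and the resulting status line compared with the 501 from the GET.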
