seaver has asked for the wisdom of the Perl Monks concerning the following question:

Dear all

I've been tackling this little problem for a while now, and I even tried using WWW::Mechanize::Shell, but it produces the same 'session expired' error.

#!/usr/bin/perl -w
use strict;
use lib '/home/visitors/seaver/modules/lib/perl5/site_perl/5.6.1';
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get("http://isiknowledge.com/?DestApp=WOS&Func=Frame");
$mech->follow_link(tag => 'frame', n => 2);
$mech->click('General Search', [2,2]);
$mech->field("topic","emergence AND complexity");
$mech->click();
my $sid = $mech->value('SID');
{
    # Re-set the hidden SID field, with warnings silenced for this block only
    local $^W = 0;
    $mech->field('SID',$sid);
}
$mech->click_button(name=>'Submit');
print $mech->content;

Basically, everything goes well until the last 'click_button', whereupon the result I get is a 'you need to start a new session' page. The session ID is kept constant in a hidden field in the form, and I've even tried 'forcing' it, as you can see, inside a scope with warnings turned off.

Does anyone have any suggestions as to what I may be doing wrong?

Thanks
Sam

Re: WWW Mechanize not keeping session?
by Corion (Patriarch) on Oct 05, 2004 at 06:34 UTC

    There is the number one rule for automating websites:

    1. Compare what your script sends to what your browser sends, and make your script send the identical data.

    Otherwise, you are just stabbing in the dark, without really knowing where the differences are. My personal guess is that somewhere or somehow the frames are tripping you up, or there is another frame that is supposed to send you a cookie, which you are not loading. There are many possibilities, but only you can check them out by comparing what your script sends against what your browser sends. I recommend the Live HTTP Headers extension for Firefox and Mozilla for convenient logging of browser requests, but HTTP::Recorder should also do this.

    Update: Fixed link to Live HTTP Headers, spotted by kutsu
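
To get the script's half of that comparison, one option is to have the script log every request it sends. The following is only a sketch: it assumes an LWP recent enough to provide add_handler (which WWW::Mechanize inherits from LWP::UserAgent), and the names @sent and $log_request are made up for illustration. It fetches nothing unless a start URL is given on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

my @sent;   # one entry per request the script actually sent

# Record the method, URL and headers of each outgoing request.
# The handler returns an empty list so that LWP carries on normally.
my $log_request = sub {
    my ($request) = @_;
    push @sent, join "\n",
        $request->method . ' ' . $request->uri,
        $request->headers->as_string;
    return;
};

my $mech = WWW::Mechanize->new();

# 'request_send' fires just before the request goes out, after the
# cookie jar has had the chance to add its Cookie header.
$mech->add_handler(request_send => $log_request);

# Pass the start URL on the command line to try it against a real site.
if (my $url = shift @ARGV) {
    $mech->get($url);
    print "$_\n" for @sent;
}
```

The printed entries can then be set side by side with what the browser logged for the same steps.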

      Corion

      I obviously don't understand cookies and the like.

      I followed your lead and used HTTP::Recorder. I got it working, but it failed with the session ID too.

      It actually failed EARLIER in the sequence of links I need to go through. How is this possible? The session ID gets changed at about the second link, but when I use my own script, the session ID never changes; it just doesn't work.

      Any more ideas?
      Sam

      Corion

      I think I've just figured it out. You mentioned other frames, and that made me think. I double-checked the web page itself, and it is indeed made up of two frames, and guess what: the session IDs in the separate frames are different!

      I have no idea what to do with the different frames, or whether it makes a difference that they have their own session IDs. Since I can't get that far with HTTP::Recorder, what should I do?

      Thanks
      Sam
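
One way to see which session ID each frame carries is to load every frame of the frameset in turn and print the value of its SID field. This is a rough sketch, not a fix: it assumes the ID sits in a form input named 'SID', as in the script at the top of the thread, and the helper sids_in_page is a made-up name. Nothing is fetched unless a start URL is given on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use HTML::Form;

# Return the value of every input named 'SID' found in the page's forms.
sub sids_in_page {
    my ($html, $base_url) = @_;
    my @sids;
    for my $form (HTML::Form->parse($html, $base_url)) {
        my $input = $form->find_input('SID') or next;
        push @sids, $input->value;
    }
    return @sids;
}

if (my $start = shift @ARGV) {
    my $mech = WWW::Mechanize->new();
    $mech->get($start);

    # Collect the frame URLs first: fetching a frame replaces the current page.
    my @frames = map { $_->url_abs } $mech->find_all_links(tag => 'frame');

    for my $frame_url (@frames) {
        $mech->get($frame_url);
        my @sids = sids_in_page($mech->content, $mech->uri->as_string);
        print "$frame_url => ", (@sids ? join(', ', @sids) : 'no SID field'), "\n";
    }
}
```

If the two frames really do report different IDs, the browser log should show which of them accompanies the final search request.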

        Trace and record your browsing session with HTTP::Recorder, or with any other logging feature, like the Live HTTP Headers. Then trace and record the session as your script produces it. Then look at the differences and eliminate them.

        Most likely, you are using the wrong session ID, or are extracting the wrong cookie, or something like that, but without looking at your script, the site you are trying to automate, and your successful browser session, that's hard to tell. If you have all three at your disposal, it's easy to see the differences, for example with the diff utility.

        I have no better help to offer you, because all the automated tools have already failed, and thus you will have to do the task manually.
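
The manual comparison can be partly scripted once both sets of headers are saved as text. The sketch below uses only core Perl. The two sample header blocks are hypothetical placeholders, not captures from the real site, and the function names are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn a block of "Name: value" lines into a hash keyed by lower-cased name.
# A repeated header keeps only its last value, which is enough for a sketch.
sub parse_headers {
    my ($text) = @_;
    my %headers;
    for my $line (split /\r?\n/, $text) {
        next unless $line =~ /^([\w-]+):\s*(.*?)\s*$/;
        $headers{lc $1} = $2;
    }
    return \%headers;
}

# Report every header that is missing on one side or has a different value.
sub header_differences {
    my ($browser, $script) = @_;
    my %names = map { $_ => 1 } keys(%$browser), keys(%$script);
    my @report;
    for my $name (sort keys %names) {
        my $theirs = exists $browser->{$name} ? $browser->{$name} : '(missing)';
        my $ours   = exists $script->{$name}  ? $script->{$name}  : '(missing)';
        next if $theirs eq $ours;
        push @report, sprintf '%s: browser [%s], script [%s]', $name, $theirs, $ours;
    }
    return @report;
}

# Hypothetical captures; paste in the real header blocks from both logs.
my $from_browser = "Host: example.com\nCookie: SID=AAA111\nReferer: http://example.com/frame\n";
my $from_script  = "Host: example.com\nCookie: SID=BBB222\n";

print "$_\n"
    for header_differences(parse_headers($from_browser), parse_headers($from_script));
```

With the sample data this reports a cookie header that differs and a referer header that the script never sent, which is the kind of difference to eliminate first.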