in reply to WWW Mechanize not keeping session?

This is the number one rule for automating websites:

  1. Compare what your script sends to what your browser sends, and make your script send the identical data.

Otherwise, you are just stabbing in the dark, without really knowing where the differences are. My personal guess is that somewhere, somehow, the frames are tripping you up, or there is another frame that is supposed to send you a cookie and that you are not loading. There are many possibilities, but only you can check them out by comparing what your script sends against what your browser sends. I recommend the Live HTTP Headers extension for Firefox and Mozilla for convenient logging of browser requests, but HTTP::Recorder should also do this.
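
A rough sketch of the script side of that comparison, which also requests every frame so their cookies land in the same jar. WWW::Mechanize does not load frames on its own, so the script has to fetch each frame source itself. The sketch assumes an LWP recent enough to have the add_handler hooks, takes the start URL on the command line, and frame_urls is a helper name made up for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: absolute URLs of every <frame>/<iframe> on the current page.
sub frame_urls {
    my ($mech) = @_;
    return map { $_->url_abs->as_string }
           $mech->find_all_links( tag_regex => qr/^i?frame$/ );
}

# Only does network work when a start URL is given on the command line.
if ( my $start = shift @ARGV ) {
    my $mech = WWW::Mechanize->new( autocheck => 1 );

    # Log each outgoing request and the headers of each response to STDERR.
    # The handlers return nothing, so LWP carries on with the real request.
    $mech->add_handler( request_send => sub {
        print STDERR $_[0]->as_string, "\n";
        return;
    } );
    $mech->add_handler( response_done => sub {
        print STDERR $_[0]->status_line, "\n", $_[0]->headers->as_string, "\n";
        return;
    } );

    $mech->get($start);

    # Fetch every frame with the same object, so all cookies share one jar.
    for my $url ( frame_urls($mech) ) {
        $mech->get($url);
    }

    # Show which cookies the script ended up with.
    print $mech->cookie_jar->as_string;
}
```

The STDERR output can then be put next to the browser log, request by request.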

Update: Fixed link to Live HTTP Headers, spotted by kutsu

Re^2: WWW Mechanize not keeping session?
by seaver (Pilgrim) on Oct 05, 2004 at 15:24 UTC
    Corion

    I obviously don't understand cookies and the like.

    I followed your lead and used HTTP::Recorder. I got it working, but it failed on the session id too.

    It actually failed EARLIER in the chain of links I need to go through. How is this possible? With HTTP::Recorder the session id gets changed at about the second link, but when I use my own script the session id never gets changed; it just doesn't work.

    Any more ideas?
    Sam

Re^2: WWW Mechanize not keeping session?
by seaver (Pilgrim) on Oct 05, 2004 at 15:30 UTC
    Corion

    I think I've just figured it out. You mentioned other frames, and that made me think. I double checked with the web page itself, and it is indeed made up of two frames, and guess what, the session ids in the separate frames are different!!

    I have no idea what to do with the different frames, or whether it makes a difference that they have their own session ids. Since I can't get that far with HTTP::Recorder, what should I do?

    Thanks
    Sam

      Trace and record your browsing session with HTTP::Recorder, or with any other logging feature, like the Live HTTP Headers. Then trace and record the session as your script produces it. Then look at the differences and eliminate them.

      Most likely, you are using the wrong session ID, or are extracting the wrong cookie, or something like that, but without looking at your script, the site you are trying to automate, and your successful browser session, that's hard to tell. If you have all three at your disposal, it's easy to see the differences, for example with the diff utility.
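
      If plain diff output is too noisy because the headers come in a different order, a small header-by-header comparison might be easier to read. This is only a sketch: it assumes each log file holds exactly one request (a request line followed by "Name: value" lines), and parse_request, diff_requests and slurp are names invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one logged request into a hash. Header names are lowercased and
# the request line is stored under the key ':request'.
sub parse_request {
    my ($text) = @_;
    my @lines = grep { length } split /\r?\n/, $text;
    my %h = ( ':request' => shift(@lines) // '' );
    for my $line (@lines) {
        next unless $line =~ /^([\w-]+):\s*(.*?)\s*$/;
        my ( $name, $value ) = ( lc $1, $2 );
        # Repeated headers are joined with a comma.
        $h{$name} = exists $h{$name} ? "$h{$name}, $value" : $value;
    }
    return \%h;
}

# Report every header that is missing, extra, or different in the script.
sub diff_requests {
    my ( $browser, $script ) = @_;
    my %seen = map { $_ => 1 } keys %$browser, keys %$script;
    my @report;
    for my $name ( sort keys %seen ) {
        my ( $want, $got ) = ( $browser->{$name}, $script->{$name} );
        if    ( !defined $got )  { push @report, "missing in script: $name: $want" }
        elsif ( !defined $want ) { push @report, "extra in script: $name: $got" }
        elsif ( $want ne $got )  { push @report, "differs: $name\n  browser: $want\n  script:  $got" }
    }
    return @report;
}

# Read a whole file into one string.
sub slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot read $file: $!";
    local $/;
    return scalar <$fh>;
}

# Usage: perl thisscript.pl browser_request.txt script_request.txt
if ( @ARGV == 2 ) {
    my ( $browser, $script ) = map { parse_request( slurp($_) ) } @ARGV;
    print "$_\n" for diff_requests( $browser, $script );
}
```

      A differing Cookie or a missing Referer line is then the first thing to fix in the script.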

      I have no better help to offer you, because all the automated tools have already failed, and thus you will have to do the task manually.