in reply to Requesting webpages which use cookies and session ids. (rev)

Baz,

I guess the session ID and engine ID are tied to the search session, so try reusing the old IDs. For the whole script, you should only need to get a sessID and engID once -- unless you perform more than 10 searches (not counting viewing the 2nd page of a search, etc.) or your script pauses for a few minutes between requests.

Update: I can query the page using lynx just fine, even with disallowing all cookies. So the answer is not in the cookie jar.
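A rough sketch of the "get the IDs once, reuse them" idea, assuming the IDs can be scraped from the home page -- the URL and the regexes below are placeholders for illustration, not taken from your script:

  use strict;
  use warnings;
  use LWP::UserAgent;
  use HTTP::Request::Common qw(GET);

  my $ua = LWP::UserAgent->new;

  # Fetch the home page once and pull the two IDs out of it.
  my $home = $ua->request(GET 'http://www.example.com/dq_home.jsp');
  die 'Could not fetch home page: ', $home->status_line
      unless $home->is_success;
  my ($sessID) = $home->content =~ /BV_SessionID=([^&"]+)/;
  my ($engID)  = $home->content =~ /BV_EngineID=([^&"]+)/;

  # Reuse the same pair for every search in this run.
  for my $surname ('Griffin', 'Murphy') {
      my $res = $ua->request(GET 'http://www.example.com/search.jsp'
          . "?BV_SessionID=$sessID&BV_EngineID=$engID&NAM=$surname");
      print "$surname: ", $res->status_line, "\n";
  }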

Re: Re: Requesting webpages which use cookies and session ids. (rev)
by Baz (Friar) on Aug 04, 2002 at 22:46 UTC
    Thanks crenz,
    Yeah, I made the same observation when I rejected cookies using lynx. The thing is, when I view the first search page using your original hack, I get the first page of results, but I also get a message (at the start of the page) saying I have been disconnected from the search.
    Also, if you do the search in IE6, you will only see the IDs being passed via the query string for the first page of search results. For subsequent pages (when you click Next) you won't see any mention of IDs in the query string. But in our program, if you include the two IDs in the string anyway, you get a message (I can't remember exactly what, but something about the server being busy; the server isn't busy, but whatever loop you fall out of, you end up with this message). If you leave the IDs out of the search string (as in my code at the moment), you get a message saying that the search utility only works for Netscape and IE. Either way, every attempt to view beyond the first page of results returns no results, just one of the two aforementioned error messages.

    Therefore I'm guessing that the IDs do need to be passed, but perhaps IE is using a different method; I really don't know at this stage. Maybe when you include the IDs the second time, the server thinks it's processing the first page again (i.e. it uses the existence of the IDs in the search string to decide whether it's the first results page or a subsequent one), and that's why you get two different errors for subsequent search pages, despite the fact that I would have expected the server to ignore the IDs, since the Next links don't contain them.
    Just now I tried repeating the search in IE. When the first page of results displayed, I copied the URL from the address bar and removed the engine and session IDs; I then opened Netscape, pasted in the new URL, and the search worked fine. Therefore I don't think the browser ever needs to receive the IDs via the query string. I'm lost; how about you?
      Sorry Jenda, I missed your post... I commented out that conversion line and now I'm getting

      To use this service you will need either an Internet Explorer (IE) browser or Netscape 4.7 and above.

      for the first search page now as well as for subsequent pages - at least there's some consistency there. :)
        1. If the server says you need either MSIE or Netscape ver x.x, you have to pretend that you are using one: $ua->agent('Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)');
        2. You do not have to fiddle with cookies that much. Once you give the $ua a cookie jar, you do not have to worry about them (see the setup sketch after this list).
        3. The parameters should not be POSTed to the script; they should be passed in the query string:
          use CGI::Enurl;
          my $request = GET $url_search . '?' . enurl {
              QRY          => 'res',
              BV_SessionID => $sessID,
              BV_EngineID  => $engID,
              new_search   => 'true',
              NAM          => 'Griffin',
              GIV          => '',
              LOC          => '',
              STR          => '',
              PCD          => 'BT',
              limit        => '50',
              CallingPage  => 'Homepage',
          };
        4. The site sends a malformed cookie that is not remembered by LWP::UserAgent. Therefore you have to send it with the last request explicitly:
          $request = GET $url_search . '?' . enurl {
              QRY         => 'res',
              NAM         => 'Griffin',
              lci         => '0',
              PCD         => 'BT',
              start_id    => '50',
              CallingPage => 'Homepage',
          }, 'Set-Cookie' => "BV_IDS=$engID:$sessID; path=;";
        5. The URL for searching is not ".../dq_locationfinder.jsp"; it is ".../dq_home.jsp" itself. ".../dq_locationfinder.jsp" only serves for finding the location on a graphical map.
        6. After all these changes the script worked fine for me. You may find it at http://jenda.krynicky.cz/perl/BT_search.pl.txt.
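        A minimal setup sketch for points 1 and 2; the cookie file name and options are only illustrative, not taken from the working script:

          use strict;
          use warnings;
          use LWP::UserAgent;
          use HTTP::Cookies;

          my $ua = LWP::UserAgent->new;

          # Point 1: pretend to be MSIE so the server does not refuse us.
          $ua->agent('Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)');

          # Point 2: give $ua a cookie jar; from here on it records and
          # replays the (well-formed) cookies by itself on every request.
          $ua->cookie_jar(HTTP::Cookies->new(file => 'cookies.txt', autosave => 1));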

          HTH, Jenda