in reply to activating javascript with perl

I did some investigative work for you, since I have nothing better to do at 2:00 in the AM: I captured a few form submissions from the browser (IE 6.0) and analyzed what was sent.

Since this is long as all fling flang, I employed a readmore.

I searched for "box" and then paged through the results. Here is the request the browser sent:

POST /entrez/query.fcgi?CMD=Pager&DB=PubMed HTTP/1.1
Referer: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=PubMed
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: www.ncbi.nlm.nih.gov
Content-Length: 296
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: LB-Hint-PubMed=0SlUKGwUky2Rn3_3B5G; WebEnv=09ZSqwvMGrYSnbQNfTtbZquWN8EWwljKW624FLtMbtqQuh4E9eSM5k

WebEnv=09ZSqwvMGrYSnbQNfTtbZquWN8EWwljKW624FLtMbtqQuh4E9eSM5k&db=PubMed&orig_db=PubMed&term=box&cmd=&cmd_current=&query_key=4&dopt=DocSum&dispmax=20&sort=&SendTo=Text&textpage=1&inputpage=2&CrntRpt=DocSum&showndispmax=20&page=0&dopt1=DocSum&dispmax1=20&sort1=&SendTo1=Text&textpage1=1&inputpage1=

And the response header:

HTTP/1.1 200 OK
Date: Mon, 30 Jun 2003 06:59:19 GMT
Server: Apache
Set-Cookie: LB-Hint-PubMed=0y_fmGJi19Cg5T8fvAR; domain=.nlm.nih.gov; path=/; expires=Mon, 30-Jun-2003 07:04:19 GMT
Set-Cookie: WebEnv=0XdSUuxNcjqgflC3VlxDhofrq5G4V7_aVtA9dsKTtm3Bq-wvz1OM5U; domain=.nlm.nih.gov; path=/; expires=Mon, 30-Jun-2003 14:59:19 GMT
Content-Type: text/html
Via: 1.1 www.ncbi.nih.gov
X-Cache: MISS from www.ncbi.nih.gov
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
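Since the WebEnv value in the response differs from the one that was sent, a script has to pick the new value out of the Set-Cookie headers before its next request. Here is a rough sketch of that step in plain Perl. The helper name cookie_value is made up for illustration, and the regex assumes the simple "name=value; attributes" layout seen in the capture above; it is not a general cookie parser.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value of one named cookie from raw
# Set-Cookie header lines, or undef if that cookie is not present.
sub cookie_value {
    my ($name, @set_cookie_lines) = @_;
    for my $line (@set_cookie_lines) {
        # The cookie pair comes first, before any ";"-separated attributes.
        if ($line =~ /^\s*(?i:Set-Cookie:\s*)?\Q$name\E=([^;\s]*)/) {
            return $1;
        }
    }
    return;
}

# The two Set-Cookie lines from the captured response.
my @headers = (
    'Set-Cookie: LB-Hint-PubMed=0y_fmGJi19Cg5T8fvAR; domain=.nlm.nih.gov; path=/; expires=Mon, 30-Jun-2003 07:04:19 GMT',
    'Set-Cookie: WebEnv=0XdSUuxNcjqgflC3VlxDhofrq5G4V7_aVtA9dsKTtm3Bq-wvz1OM5U; domain=.nlm.nih.gov; path=/; expires=Mon, 30-Jun-2003 14:59:19 GMT',
);

my $webenv = cookie_value('WebEnv', @headers);
print "WebEnv for the next POST: $webenv\n";
```

If you use LWP::UserAgent with a cookie jar (LWP::UserAgent->new(cookie_jar => {})), the Cookie header should be taken care of for you, but the WebEnv form field would presumably still need the value filled in by hand.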

So what can we tell about all this mumbo jumbo? A form is posted to /entrez/query.fcgi?CMD=Pager&DB=PubMed. Notice that the site uses cookies and the cookie is changed on every access to the page. Also, there are 22 fields you need to submit (everything should be submitted as is unless it has an arrow pointing to it):

WebEnv=          <---Same as what's in your cookie, changes every time
db=PubMed
orig_db=PubMed
term=            <---Search term
cmd=
cmd_current=
query_key=       <---Need to extract out of the first page (server seems to enumerate searches...this will eventually get wiped out if you wait too long)
dopt=DocSum
dispmax=20
sort=
SendTo=Text
textpage=        <---Current page numerically
inputpage=       <---Page we're going to
CrntRpt=DocSum
showndispmax=20
page=            <---Page prior to textpage
dopt1=DocSum
dispmax1=20
sort1=
SendTo1=Text
textpage1=       <---Always seems to be the same as textpage
inputpage1=
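To make the list above concrete, here is a minimal sketch that assembles the 22 fields in the captured order and encodes them as a form body. The sub names (pager_fields, form_body, esc) and the argument names are my own inventions for illustration. The fixed values are simply copied from the capture, and nothing here is a documented interface.

```perl
use strict;
use warnings;

# Hypothetical helper: the 22 fields, in the order the browser sent them.
# Only the "arrow" fields are parameters; the rest are copied from the capture.
sub pager_fields {
    my (%arg) = @_;
    return [
        WebEnv       => $arg{webenv},      # same value as the WebEnv cookie
        db           => 'PubMed',
        orig_db      => 'PubMed',
        term         => $arg{term},        # search term
        cmd          => '',
        cmd_current  => '',
        query_key    => $arg{query_key},   # extracted from the first page
        dopt         => 'DocSum',
        dispmax      => 20,
        sort         => '',
        SendTo       => 'Text',
        textpage     => $arg{textpage},    # current page
        inputpage    => $arg{inputpage},   # page we're going to
        CrntRpt      => 'DocSum',
        showndispmax => 20,
        page         => $arg{page},        # page prior to textpage
        dopt1        => 'DocSum',
        dispmax1     => 20,
        sort1        => '',
        SendTo1      => 'Text',
        textpage1    => $arg{textpage},    # always seems to match textpage
        inputpage1   => '',
    ];
}

# Percent-encode everything outside the unreserved set. Spaces become %20
# rather than "+", which servers generally accept as well.
sub esc {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord $1)/ge;
    return $s;
}

# Join name/value pairs into an application/x-www-form-urlencoded body.
sub form_body {
    my ($fields) = @_;
    my @pairs;
    for (my $i = 0; $i < @$fields; $i += 2) {
        my ($k, $v) = map { esc($_) } @{$fields}[$i, $i + 1];
        push @pairs, "$k=$v";
    }
    return join '&', @pairs;
}

# Rebuild the captured request body from its variable parts.
my $fields = pager_fields(
    webenv    => '09ZSqwvMGrYSnbQNfTtbZquWN8EWwljKW624FLtMbtqQuh4E9eSM5k',
    term      => 'box',
    query_key => 4,
    textpage  => 1,
    inputpage => 2,
    page      => 0,
);
my $body = form_body($fields);
print length($body), " bytes\n$body\n";
```

Fed the values from the capture, this should reproduce the captured body, 296 bytes long, which agrees with the Content-Length header. With LWP::UserAgent installed, the same array ref could be passed as the form argument to $ua->post($url, $fields) instead of encoding it by hand.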

Now you're probably wondering to yourself, "gee antirice, is that all there is to it?" To tell you the truth, I really don't know. I captured a total of five transactions with three different search terms and compared them to see what was different and what correlated to what. I did this to help because I run into this sort of thing quite frequently, and the JavaScript module works OK but not always as well as one would hope.
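The comparison I did by eye can be roughed out in Perl too: parse each captured body into a hash and list the fields whose values are not identical across the captures. The sub names parse_body and changed_fields are hypothetical. The first sample body is a trimmed piece of the capture above, and the second one is invented purely to show the output; it is not a real capture.

```perl
use strict;
use warnings;

# Split an application/x-www-form-urlencoded body into a hash of fields.
sub parse_body {
    my ($body) = @_;
    my %field;
    for my $pair (split /&/, $body) {
        my ($k, $v) = split /=/, $pair, 2;
        $v = '' unless defined $v;
        for ($k, $v) {
            tr/+/ /;                                # "+" means space
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;      # undo percent-encoding
        }
        $field{$k} = $v;
    }
    return \%field;
}

# Return the sorted names of fields that differ between any two captures
# (a field missing from one capture counts as different).
sub changed_fields {
    my @caps = map { parse_body($_) } @_;
    my %names = map { $_ => 1 } map { keys %$_ } @caps;
    my @changed;
    for my $k (sort keys %names) {
        my %seen;
        $seen{ exists $_->{$k} ? "=$_->{$k}" : 'missing' } = 1 for @caps;
        push @changed, $k if keys(%seen) > 1;
    }
    return @changed;
}

my $first  = 'db=PubMed&term=box&textpage=1&inputpage=2&page=0';   # trimmed from the capture
my $second = 'db=PubMed&term=box&textpage=2&inputpage=3&page=1';   # made up for illustration
my @diff = changed_fields($first, $second);
print "Fields that changed: @diff\n";
```

Anything that never shows up in the output is a good candidate for "submit as is".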

In case you were wondering, I captured everything using a program called Ethereal. I find it to be a very handy tool when trying to mimic any sort of transaction without documentation. Please note that what we have here is a hack at best. If they really wanted you to do this in an automated fashion, they probably would have provided a documented interface to the search engine. Of course, it's the government and they may not have gotten around to it yet.

Hope this helps. Pray that they never change the interface :-P

antirice    
The first rule of Perl club is - use Perl
The ith rule of Perl club is - follow rule i - 1 for i > 1