I decided to do some investigative work for you since I have nothing to do right now at 2:00 in the AM. I decided to capture a few submissions from the browser (IE 6.0) and analyze what was submitted.

Since this is long as all fling flang, I employed a readmore.

I decided to submit "box" and check out the results.

POST /entrez/query.fcgi?CMD=Pager&DB=PubMed HTTP/1.1
Referer: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=PubMed
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Host: www.ncbi.nlm.nih.gov
Content-Length: 296
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: LB-Hint-PubMed=0SlUKGwUky2Rn3_3B5G; WebEnv=09ZSqwvMGrYSnbQNfTtbZquWN8EWwljKW624FLtMbtqQuh4E9eSM5k

WebEnv=09ZSqwvMGrYSnbQNfTtbZquWN8EWwljKW624FLtMbtqQuh4E9eSM5k&db=PubMed&orig_db=PubMed&term=box&cmd=&cmd_current=&query_key=4&dopt=DocSum&dispmax=20&sort=&SendTo=Text&textpage=1&inputpage=2&CrntRpt=DocSum&showndispmax=20&page=0&dopt1=DocSum&dispmax1=20&sort1=&SendTo1=Text&textpage1=1&inputpage1=

And the response header:

HTTP/1.1 200 OK
Date: Mon, 30 Jun 2003 06:59:19 GMT
Server: Apache
Set-Cookie: LB-Hint-PubMed=0y_fmGJi19Cg5T8fvAR; domain=.nlm.nih.gov; path=/; expires=Mon, 30-Jun-2003 07:04:19 GMT
Set-Cookie: WebEnv=0XdSUuxNcjqgflC3VlxDhofrq5G4V7_aVtA9dsKTtm3Bq-wvz1OM5U; domain=.nlm.nih.gov; path=/; expires=Mon, 30-Jun-2003 14:59:19 GMT
Content-Type: text/html
Via: 1.1 www.ncbi.nih.gov
X-Cache: MISS from www.ncbi.nih.gov
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
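Since the next request has to send the fresh WebEnv value back, here is a minimal sketch (in Python, standard library only) of pulling it out of the two Set-Cookie headers in the response above, each joined onto a single line. The helper name `cookie_value` is my own invention for illustration, and the parsing is deliberately naive: it only looks at the name=value pair before the first semicolon.

```python
# The two Set-Cookie header values from the captured response.
SET_COOKIE_HEADERS = [
    "LB-Hint-PubMed=0y_fmGJi19Cg5T8fvAR; domain=.nlm.nih.gov; path=/; "
    "expires=Mon, 30-Jun-2003 07:04:19 GMT",
    "WebEnv=0XdSUuxNcjqgflC3VlxDhofrq5G4V7_aVtA9dsKTtm3Bq-wvz1OM5U; "
    "domain=.nlm.nih.gov; path=/; expires=Mon, 30-Jun-2003 14:59:19 GMT",
]


def cookie_value(set_cookie_headers, name):
    """Return the value of cookie `name`, or None if no header sets it.

    Hypothetical helper: it ignores attributes such as domain, path
    and expires, and only reads the leading name=value pair.
    """
    for header in set_cookie_headers:
        # The name=value pair is everything before the first ';'.
        pair = header.split(";", 1)[0]
        key, _, value = pair.partition("=")
        if key.strip() == name:
            return value.strip()
    return None


webenv = cookie_value(SET_COOKIE_HEADERS, "WebEnv")
print(webenv)
```

A real HTTP client with a cookie jar would normally handle this for you; the point is only that the value has to be available separately, because it also goes into the form body.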

So what can we tell about all this mumbo jumbo? A form is posted to /entrez/query.fcgi?CMD=Pager&DB=PubMed. Notice that the site uses cookies and the cookie is changed on every access to the page. Also, there are 22 fields you need to submit (everything should be submitted as is unless it has an arrow pointing to it):

WebEnv=          <---Same as what's in your cookie, changes every time
db=PubMed
orig_db=PubMed
term=            <---Search term
cmd=
cmd_current=
query_key=       <---Need to extract out of the first page (server seems to enumerate searches...this will eventually get wiped out if you wait too long)
dopt=DocSum
dispmax=20
sort=
SendTo=Text
textpage=        <---Current page numerically
inputpage=       <---Page we're going to
CrntRpt=DocSum
showndispmax=20
page=            <---Page prior to textpage
dopt1=DocSum
dispmax1=20
sort1=
SendTo1=Text
textpage1=       <---Always seems to be the same as textpage
inputpage1=
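To make the field list concrete, here is a rough Python sketch that assembles the 22 fields into a form body, in the same order as the capture. The function name `build_pager_form` and its parameters are made up for illustration. It also bakes in my guesses from the captures: that `page` is always `textpage - 1` and that `textpage1` mirrors `textpage`. Treat it as a sketch of the observed behaviour, not as a documented interface.

```python
from urllib.parse import urlencode


def build_pager_form(webenv, term, query_key, current_page, target_page):
    """Build the urlencoded POST body for the Pager request.

    Hypothetical helper. Fixed values are copied from the capture;
    only the fields marked with arrows in the list are parameters.
    """
    fields = [
        ("WebEnv", webenv),                 # same value as the WebEnv cookie
        ("db", "PubMed"),
        ("orig_db", "PubMed"),
        ("term", term),                     # search term
        ("cmd", ""),
        ("cmd_current", ""),
        ("query_key", str(query_key)),      # extracted from the first page
        ("dopt", "DocSum"),
        ("dispmax", "20"),
        ("sort", ""),
        ("SendTo", "Text"),
        ("textpage", str(current_page)),    # current page
        ("inputpage", str(target_page)),    # page we're going to
        ("CrntRpt", "DocSum"),
        ("showndispmax", "20"),
        ("page", str(current_page - 1)),    # assumption: page prior to textpage
        ("dopt1", "DocSum"),
        ("dispmax1", "20"),
        ("sort1", ""),
        ("SendTo1", "Text"),
        ("textpage1", str(current_page)),   # assumption: same as textpage
        ("inputpage1", ""),
    ]
    return urlencode(fields)


body = build_pager_form(
    webenv="09ZSqwvMGrYSnbQNfTtbZquWN8EWwljKW624FLtMbtqQuh4E9eSM5k",
    term="box",
    query_key=4,
    current_page=1,
    target_page=2,
)
print(body)
print(len(body))
```

Fed the values from the "box" capture, this should reproduce the captured body, which is a cheap sanity check before pointing it at the real server.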

Now you're probably wondering to yourself, "gee antirice, is that all there is to it?" To tell you the truth I really don't know. I captured a total of five transactions with three different search terms and compared them to see what was different and what correlated to what. I was just doing this to help you since I run into this quite frequently and the Javascript module works ok but not always as well as one would hope.
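The comparison itself is easy to automate. Below is a small Python sketch of that kind of diff between two captured form bodies. The name `changed_fields` is my own, the first body is a shortened version of the "box" capture, and the second body is entirely hypothetical (I'm not reproducing the other real captures here).

```python
from urllib.parse import parse_qsl


def changed_fields(body_a, body_b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ.

    Hypothetical helper. Blank values are kept so that empty fields
    such as 'cmd=' still take part in the comparison.
    """
    a = dict(parse_qsl(body_a, keep_blank_values=True))
    b = dict(parse_qsl(body_b, keep_blank_values=True))
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


# Shortened version of the real "box" capture.
first = "term=box&cmd=&query_key=4&textpage=1&inputpage=2&page=0"
# Hypothetical second capture, for illustration only.
second = "term=fox&cmd=&query_key=5&textpage=1&inputpage=2&page=0"

for name, (old, new) in changed_fields(first, second).items():
    print(f"{name}: {old!r} -> {new!r}")
```

Fields that never show up in the diff across several captures are the ones that can probably be submitted as is.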

In case you were wondering, I captured everything using a program called Ethereal. I find it to be a very handy tool when trying to mimic any sort of transaction without documentation. Please note that what we have here is a hack at best. If they really wanted you to do this in an automated fashion, they probably would have provided a documented interface to the search engine. Of course, it's the government and they may not have gotten around to it yet.

Hope this helps. Pray that they never change the interface :-P

antirice    
The first rule of Perl club is - use Perl
The ith rule of Perl club is - follow rule i - 1 for i > 1


In reply to Re: activating javascript with perl by antirice
in thread activating javascript with perl by dannoura
