Hi Adam, thank you for taking the time to write. I am using a browser-mimicking User-Agent with LWP for the search that is giving me cookie problems.
So, I ran the script that you posted, and the cookie looked good. Yet still no dice on my original search.
Could you tell me how the $req->content and $req->content_type methods work?
I ran Firefox's Live HTTP Headers extension when I originally put the POST together, and here is what it captured.
Here is the original POST:
http://nl.newsbank.com/nl-search/we/Archives
POST http://nl.newsbank.com/nl-search/we/Archives HTTP/1.1
Host: nl.newsbank.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8)
+Gecko/20050511 Firefox/1.0.4
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9
+,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Referer: http://nl.newsbank.com/nl-search/we/Archives/?p_action=custom
+ized&s_search_type=customized&p_product=NewsLibrary&p_theme=newslibra
+ry2&d_sources=location&d_place=United%20States&p_nbid=&
Cookie: JServSessionIdnewslib=lgfbb4ahn1.JS36b
Proxy-Authorization: Basic c2NpY2FsYTphZWRmamtvOw==
Content-Type: application/x-www-form-urlencoded
Content-Length: 4721
s_siteloc=NL2&p_queryname=4000&p_action=search&p_product=NewsLibrary&p
+_theme=newslibrary2&s_search_type=customized&d_sources=location&d_pla
+ce=United+States&p_nbid=&p_field_psudo-sort-0=psudo-sort&f_multi=&p_m
+ulti=PFNB%7CBEHB%7CCWCB%7CVC%7CKCJB%7CTNTB%7COLPB%7CIG%7CSE%7CSR%7CVB
+JB%7CYHRB%7CPJCB%7COR%7CERGB%7CSSJB%7CCBWB%7CATSB%7CCFAB%7CACPB%7CBCA
+B%7CCLFB%7CCERB%7CCLKB%7CCC%7CWDDB%7CLA%7CHDRB%7CDVTB%7CDVEB%7CDSSB%7
+CFBAB%7CFB%7CGZ%7CHNSB%7CVDBB%7CLOPC%7CRCPB%7CLCRB%7CLODB%7CLB%7CLBEB
+%7CMIJB%7CMBNB%7CMSSB%7CMPPB%7CMS%7CMC%7CPCMB%7COKTB%7COC%7COMRB%7CPF
+TB%7CPPOB%7CPNTB%7CSA%7CRS%7CRBDB%7CRDFB%7CRWTB%7CCVRB%7CSB%7CSDUN%7C
+SFCB%7CSGVB%7CSJ%7CSMCB%7CSMTB%7CYCSB%7CPSWB%7CSBCB%7CTPRB%7CVTHB%7CE
+TSB%7CTVHB%7CSO%7CTARB%7CUDJB%7CVCSB%7CVTDB%7CWDNB%7CWNCB%7CAS%7CFDNB
+%7CJUEB%7CKTVB%7CHNAB%7CCDMB%7CCIZB%7CHHDB%7CDNLB%7CVNRB%7CVBOB%7CFFS
+B%7CRTDB%7CRO%7CNK%7CACTB%7CCO%7CECDB%7CGVRB%7CFV%7CGB%7CDHSB%7CRLOB%
+7CNCTB%7CWSNB%7CWJ%7CNWJB%7CAN%7CBDRB%7CSDTB%7CBS%7CWP%7CWT%7CAIMB%7C
+GRNB%7CRHHB%7CSHJB%7CCS%7CMB%7CAGCB%7CAT%7CCMNB%7CCLSB%7CCAAB%7CSGPB%
+7CCSRB%7CCL%7CMT%7CSMNB%7CSAVB%7CGMTB%7CMWTB%7CBH%7CCNCB%7CNJ%7CDSFB%
+7CDLAB%7CEN%7CESFB%7CFTUB%7CFLTB%7CLPMB%7CMH%7CFMNB%7CNPFB%7COSBB%7CP
+A%7CPBPB%7CPNJB%7CHT%7CSPTB%7CFS%7CTD%7CTT%7CTCBB%7CVGSB%7CVB%7CGFTB%
+7CIDRB%7CIDSB%7CLT%7CMV%7CWTEB%7CLVRB%7CRGJB%7CDSNB%7CLHJB%7CPCRB%7CS
+LTB%7CSGSB%7COSXB%7CBNTB%7CCSJB%7CDC%7CDP%7CGJSB%7CDRTB%7CFCCB%7CFMTB
+%7CCSGB%7CSJAB%7CMMNB%7CRBTB%7CPBJB%7CRM%7CABGB%7CADSB%7CARPB%7CLETB%
+7CTUCB%7CAODB%7CAJ%7CAQTB%7CCCAB%7CFDTB%7CLCSB%7CRONB%7CSF%7CSCPB%7CB
+K%7CGF%7CAB%7CSALB%7CLJSB%7CEM%7CHDNB%7CHCNB%7CJW%7CMM%7COHDB%7CTCJB%
+7CWE%7CNT%7CMFCB%7CSLLB%7CMMLB%7CSCTB%7CSP%7CMN%7CDMRB%7CCR%7CBHEB%7C
+ICPB%7CDQ%7CCLOB%7CKDRB%7CJGMB%7CKC%7CSNLB%7CSBRB%7CSDRB%7CSL%7CCGNB%
+7CWBDB%7CMDPB%7CWDTB%7CGRBB%7CMHRB%7CMFHB%7CMWSB%7COSHB%7CPTCB%7CFDRB
+%7CSHPB%7CSPJB%7CWFMB%7CWDHB%7CMLJB%7CMD%7CAQCB%7CARVB%7CAHPB%7CBCRB%
+7CBGCB%7CBC%7CND%7CCGCB%7CCN%7CCSTB%7CADHB%7CDSOB%7CDERB%7CDPLB%7CDWB
+B%7CDCHB%7CDEHB%7CHDGB%7CDLGB%7CDOBB%7CDWSB%7CETRB%7CETVB%7CEDIB%7CEG
+TB%7CELVB%7CEVRB%7CRFLB%7CFHJB%7CGLNB%7CGANB%7CGYRB%7CGURB%7CDH%7CHN%
+7CHPNB%7CHERB%7CLFRB%7CLHRB%7CLKRB%7CLZCB%7CLBVB%7CLIRB%7CLWRB%7CMWHB
+%7CMPHB%7CMGCB%7CMPTB%7CMRVB%7CNP%7CNS%7CCNGB%7CNHSB%7CHNNB%7CNBSB%7C
+NHJB%7COAKB%7CPCSB%7CBL%7CPHAB%7CJS%7CRGMB%7CRRSB%7CRMRB%7CSGRB%7CSKR
+B%7CSI%7CSTRB%7CJR%7CHITB%7CJPBB%7CVHRB%7CWWCB%7CWHDB%7CWCHB%7CWCSB%7
+CWMLB%7CWNTB%7CMCTB%7CEC%7CVHDB%7CINLB%7CIBJB%7CIN%7CLJCB%7CJG%7CFW%7
+CNDLB%7CRPIB%7CGPTB%7CMSPB%7CVSCB%7CBCEB%7CFP%7CDTNB%7CFL%7CGRPB%7CHD
+TB%7CLSJB%7CMLMB%7CSGNB%7CPTHB%7CVPTB%7CNADB%7CAK%7CTB%7CCGZB%7CCEQB%
+7CCK%7CCLDB%7CCOTB%7CDDNB%7CHONB%7CLEGB%7CLM%7CMSTB%7CMOJB%7CPCNB%7CM
+NJB%7CFNMB%7CCPDB%7CSNSB%7CBTFB%7CZTRB%7CBD%7CKBJB%7CWMSB%7CME%7CCMOB
+%7CNTGB%7CUL%7CBBRB%7CBRFB%7CBFPB%7CBCTB%7CPBEB%7CBG%7CBNHB%7CCAPB%7C
+CT%7CDPNB%7CGLMB%7CHHSB%7CNATB%7CPFPB%7CAPSB%7CUN%7CFSEB%7CSOCB%7CLO%
+7CTTMB%7CWO%7CCTPB%7CDRNB%7CDNCB%7CFFCB%7CGCZB%7CHC%7CNNRB%7CNHRB%7CM
+FSB%7CDNEB%7CNCNB%7CNRBB%7CWRAB%7CSCYB%7CWTNB%7CWMZB%7CBN%7CDNBB%7CDR
+RB%7CDLPB%7CIHJB%7CJC%7CWJNB%7CLINB%7CNYOB%7CNY%7CNYSB%7CNWDB%7CUODB%
+7CPPRB%7CSY%7CPKJB%7CBPSB%7CRDCB%7CESGB%7CAL%7CWA%7CASBB%7CBYJB%7CWBC
+B%7CCYJB%7CBCNB%7CCHCB%7CVDJB%7CMDRB%7CENEB%7CHUDB%7CENPB%7CENUB%7CEB
+TB%7CNJJB%7CKRJB%7COCOB%7CAC%7CBE%7CSCJB%7CSTLB%7CWFJB%7CLBCB%7CBDWB%
+7CLDNB%7CDT%7CDCCB%7CET%7CHESB%7CDPIB%7CLC%7CKLTB%7CAMCB%7CNCPB%7CWOR
+B%7CDN%7CPI%7CPG%7CPTRB%7CCPOB%7CWB%7CTRGB%7CVIMB%7CKVNB%7CYK%7CYKDB%
+7CLCJB%7CBGDB%7CKYPB%7CLH%7CLEOB%7COMIB%7CCA%7CMDTB%7CJKSB%7CKX%7CCLC
+B%7CNTNB%7CJCLB%7CMHAB%7CBX%7CBI%7CDSAB%7CMBRB%7CMGAB%7CAMRB%7CFPJB%7
+CRWPB%7CDAOB%7COK%7COJRB%7COBNB%7CMTDB%7CTLWB%7CARNB%7CADAB%7CAGNB%7C
+AGTB%7CAMGB%7CAASB%7CBTSB%7CBT%7CBZFB%7CCCCB%7CNDSB%7CDM%7CDRCB%7CEPT
+B%7CST%7CGDNB%7CGRMB%7CHCBF%7CKDTB%7CLAEB%7CLASB%7CLTJB%7CLAJB%7CLTNB
+%7CTXMB%7CMRTB%7CPDHB%7CSAGB%7CSAEC%7CSGEB%7CTCSB%7CVA%7CWTHB%7CBBTB%
+7CAD%7CADTB%7CNSRB%7CLADB%7CODWB%7CNOBB%7CMNSB%7CSTIB%7CTP%7CATNB%7CA
+TYB%7CAMBB%7CAMNB%7CASRB%7CTDNB%7CBICB%7CBLRB%7CTNBB%7CBBYB%7CBMZB%7C
+CTGB%7CCTMB%7CCQDB%7CCDRB%7CCMGB%7CCUJB%7CRDMB%7CEBAB%7CEBNB%7CEBCB%7
+CEQUB%7CFPGB%7CHDAB%7CHYRB%7CINWB%7CIBDB%7CIDGB%7CIMWB%7CMARB%7CDMJB%
+7CMMEB%7CMVNB%7CTMTB%7CNMNB%7CNBRB%7CTMWB%7COTNB%7CTPAB%7CPPLB%7CRGRB
+%7CSDNB%7CTMZB%7CTMBB%7CAPAB%7CBBAB%7CKR%7CSN%7CTNSB%7CWSOB%7CCNEB%7C
+USLB%7CCBSB%7CCRSB%7CCNBB%7CCNNB%7CCNIB%7CCNFB%7CFDCB%7CFOXB%7CSNBB%7
+CNR%7CNBCB%7CCSMB%7CNWEC%7CUW&p_widesearch=smart&p_sort=YMD_date%3AD&
+p_maxdocs=200&p_perpage=10&p_text_base-0=please+work&p_field_base-0=&
+p_bool_base-1=AND&p_text_base-1=&p_field_base-1=&p_bool_base-2=AND&p_
+text_base-2=&p_field_base-2=&p_text_YMD_date-0=1%2F1%2F2000+to+6%2F1%
+2F2000&p_field_YMD_date-0=YMD_date&p_params_YMD_date-0=date%3AB%2CE&p
+_field_YMD_date-3=YMD_date&p_params_YMD_date-3=date%3AB%2CE&Search.x=
+37&Search.y=21&Search=Search
HTTP/1.x 200 OK
Date: Thu, 14 Jul 2005 16:12:34 GMT
Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Pragma: no-cache
Content-Type: text/html
X-Cache: MISS from shade.uchicago.edu
Proxy-Connection: close
----------------------------------------------------------
http://nl.newsbank.com/nl-search/we/Archives
GET http://nl.newsbank.com/nl-search/we/Archives HTTP/1.1
Host: nl.newsbank.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8)
+Gecko/20050511 Firefox/1.0.4
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Referer: http://nl.newsbank.com/nl-search/we/Archives
Cookie: JServSessionIdnewslib=lgfbb4ahn1.JS36b
Proxy-Authorization: Basic c2NpY2FsYTphZWRmamtvOw==
HTTP/1.x 200 OK
Date: Thu, 14 Jul 2005 16:12:55 GMT
Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2
Cache-Control: no-store, no-cache, must-revalidate
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Pragma: no-cache
Content-Type: text/html
X-Pad: avoid browser bug
X-Cache: MISS from shade.uchicago.edu
Proxy-Connection: close
----------------------------------------------------------
http://nl.newsbank.com/favicon.ico
GET http://nl.newsbank.com/favicon.ico HTTP/1.1
Host: nl.newsbank.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8)
+Gecko/20050511 Firefox/1.0.4
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Cookie: JServSessionIdnewslib=lgfbb4ahn1.JS36b
Proxy-Authorization: Basic c2NpY2FsYTphZWRmamtvOw==
HTTP/1.x 404 Not Found
Date: Thu, 14 Jul 2005 16:12:58 GMT
Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2
WWW-Authenticate: Basic realm="NewsLibrary"
Last-Modified: Mon, 03 Nov 2003 15:41:26 GMT
Etag: "1d3882-1491-3fa67726"
Accept-Ranges: bytes
Content-Length: 5265
Content-Type: text/html
X-Cache: MISS from shade.uchicago.edu
Proxy-Connection: keep-alive
----------------------------------------------------------
And then when I click through to the next page, Live HTTP Headers shows:
http://nl.newsbank.com/nl-search/we/Archives/?p_action=list&p_topdoc=1
+1&d_sources=location&p_nbid=
GET http://nl.newsbank.com/nl-search/we/Archives/?p_action=list&p_topd
+oc=11&d_sources=location&p_nbid= HTTP/1.1
Host: nl.newsbank.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8)
+Gecko/20050511 Firefox/1.0.4
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9
+,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Referer: http://nl.newsbank.com/nl-search/we/Archives
Cookie: JServSessionIdnewslib=lgfbb4ahn1.JS36b
Proxy-Authorization: Basic c2NpY2FsYTphZWRmamtvOw==
HTTP/1.x 200 OK
Date: Thu, 14 Jul 2005 16:13:27 GMT
Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2
Cache-Control: no-store, no-cache, must-revalidate
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Pragma: no-cache
Content-Type: text/html
X-Pad: avoid browser bug
X-Cache: MISS from shade.uchicago.edu
Proxy-Connection: close
----------------------------------------------------------
http://nl.newsbank.com/nl-search/we/Archives/?p_action=list&p_topdoc=1
+1&d_sources=location&p_nbid=
GET http://nl.newsbank.com/nl-search/we/Archives/?p_action=list&p_topd
+oc=11&d_sources=location&p_nbid= HTTP/1.1
Host: nl.newsbank.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8)
+Gecko/20050511 Firefox/1.0.4
Accept: */*
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Referer: http://nl.newsbank.com/nl-search/we/Archives/?p_action=list&p
+_topdoc=11&d_sources=location&p_nbid=
Cookie: JServSessionIdnewslib=lgfbb4ahn1.JS36b
Proxy-Authorization: Basic c2NpY2FsYTphZWRmamtvOw==
HTTP/1.x 200 OK
Date: Thu, 14 Jul 2005 16:13:33 GMT
Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2
Cache-Control: no-store, no-cache, must-revalidate
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Pragma: no-cache
Content-Type: text/html
X-Pad: avoid browser bug
X-Cache: MISS from shade.uchicago.edu
Proxy-Connection: close
----------------------------------------------------------
http://nl.newsbank.com/favicon.ico
GET http://nl.newsbank.com/favicon.ico HTTP/1.1
Host: nl.newsbank.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8)
+Gecko/20050511 Firefox/1.0.4
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Proxy-Connection: keep-alive
Cookie: JServSessionIdnewslib=lgfbb4ahn1.JS36b
Proxy-Authorization: Basic c2NpY2FsYTphZWRmamtvOw==
HTTP/1.x 404 Not Found
Date: Thu, 14 Jul 2005 16:13:36 GMT
Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2
WWW-Authenticate: Basic realm="NewsLibrary"
Last-Modified: Mon, 03 Nov 2003 15:41:03 GMT
Etag: "d528c-1491-3fa6770f"
Accept-Ranges: bytes
Content-Length: 5265
Content-Type: text/html
X-Cache: MISS from shade.uchicago.edu
Proxy-Connection: keep-alive
----------------------------------------------------------
In my code I POST for the first search, GET for the next page, and set the Referer header on both. But as you can see, there are a lot of headers in the Live HTTP Headers output that I don't understand.
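A minimal sketch of that flow (not the actual script), assuming LWP::UserAgent with an in-memory HTTP::Cookies jar and HTTP::Request::Common to build the requests. The form fields and the first Referer are abbreviated from the capture above, and the lines that would send the requests are commented out so that nothing touches the network. It also prints what $req->content_type and $req->content return for the POST.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Request::Common qw(GET POST);

my $base = 'http://nl.newsbank.com/nl-search/we/Archives';

# One user agent for both requests, so the session cookie set by the
# first response is sent back automatically on the second.
my $ua = LWP::UserAgent->new(
    agent      => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4',
    cookie_jar => HTTP::Cookies->new,
);

# First search: a form POST. Only a few of the captured fields are shown;
# the Referer is shortened from the one in the capture.
my $search = POST $base,
    Referer => "$base/?p_action=customized",
    Content => [
        p_action        => 'search',
        p_product       => 'NewsLibrary',
        'p_text_base-0' => 'please work',
        p_perpage       => 10,
    ];

# content_type is the Content-Type header; content is the encoded body.
print $search->content_type, "\n";
print $search->content, "\n";

# Next page: a plain GET whose Referer is the search URL.
my $next = GET "$base/?p_action=list&p_topdoc=11&d_sources=location&p_nbid=",
    Referer => $base;

# Sending is left commented out here:
# my $res1 = $ua->request($search);
# my $res2 = $ua->request($next);
```

If the same $ua object is not reused for the second request, the JServSessionIdnewslib cookie from the first response would not be sent along, which may be worth ruling out.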
Any help would be much appreciated.
Steve