xeroxed_yeti has asked for the wisdom of the Perl Monks concerning the following question:

Hi Perlmonks,

I want to parse a web page that uses the POST method and session IDs.

Using Wireshark to follow the TCP stream, I see that my browser sends the following requests:

GET /blabla/faces/infoSuche.jsp HTTP/1.1
Host: www.blabla.de
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: JSESSIONID=A51C35282459D387BF075973B9D5CB57

POST /blabla/faces/infoSuche.jsp HTTP/1.1
Host: www.blabla.de
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://www.blabla.de/blabla/faces/infoSuche.jsp
Cookie: JSESSIONID=A51C35282459D387BF075973B9D5CB57
Content-Type: application/x-www-form-urlencoded
Content-Length: 366

j_id_jsp_1095591257_1=j_id_jsp_1095591257_1&j_id_jsp_1095591257_1%3Apflegeart=1000&j_id_jsp_1095591257_1%3Apflegeart=9000&j_id_jsp_1095591257_1%3Apflegeart=6000&j_id_jsp_1095591257_1%3ApLZ=55425&j_id_jsp_1095591257_1%3Aentfernung=30&j_id_jsp_1095591257_1%3Aort=&j_id_jsp_1095591257_1%3Aj_id_jsp_1095591257_49=Suche+starten&javax.faces.ViewState=j_id31091%3Aj_id31092

The connection is always keep-alive. However, my Perl script won't do that, and I don't know how to configure LWP::UserAgent for this task...

Here is my subroutine with the UserAgent part:

sub establishConnection{
    my ($url) = @_;

    # create UserAgent
    my $agent = LWP::UserAgent->new();

    # keep alive
    $agent->conn_cache(LWP::ConnCache->new());
    # the newly created connection cache object will cache only one connection at time
    # To have it cache more, total_capacity attribute is used.
    # total_capacity(10) will cache 10 connection, undef with no limits
    $agent->conn_cache->total_capacity(undef);

    # user agent
    $agent->agent('Opera/9.80 (X11; Linux i686; U; en) Presto/2.2.15 Version/10.10');

    # timeout
    $agent->timeout();

    # cookies
    $agent->cookie_jar( {} );

    # post request
    my $response = $agent->post($url,
        [
            'j_id_jsp_1095591257_1' => 'j_id_jsp_1095591257_1',
            'j_id_jsp_1095591257_1:pflegeart' => '1000',
            'j_id_jsp_1095591257_1:pLZ' => '55425' ,
            'j_id_jsp_1095591257_1:entfernung' => '20',
            'j_id_jsp_1095591257_1:ort' => '',
            'j_id_jsp_1095591257_1:j_id_jsp_1095591257_49' => 'Suche+starten',
            'javax.faces.ViewState' => 'j_id10035:j_id10037',
        ]
    );

    print $response->content();
}

The response always ends with a "session timeout"... Here are the headers generated by my script:

POST /blabla/faces/infoSuche.jsp HTTP/1.1
TE: deflate,gzip;q=0.3
Keep-Alive: 300
Connection: Keep-Alive, TE
Host: www.blabla.de
User-Agent: Opera/9.80 (X11; Linux i686; U; en) Presto/2.2.15 Version/10.10
Content-Length: 0
Content-Type: application/x-www-form-urlencoded

HTTP/1.1 200 OK
Date: Thu, 31 Dec 2009 11:33:15 GMT
Server: Apache
X-Powered-By: Servlet 2.4; JBoss-4.2.1.GA (build: SVNTag=JBoss_4_2_1_GA date=200707131605)/Tomcat-5.5
X-Powered-By: JSF/1.2
Set-Cookie: JSESSIONID=20D3FDFCB66F6D1A75766D55011A6A29; Path=/
Content-Language: de-DE
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html;charset=UTF-8

POST /blabla/faces/infoSuche.jsp HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE
Host: www.blabla.de
User-Agent: Opera/9.80 (X11; Linux i686; U; en) Presto/2.2.15 Version/10.10
Content-Length: 290
Content-Type: application/x-www-form-urlencoded
Cookie: JSESSIONID=20D3FDFCB66F6D1A75766D55011A6A29
Cookie2: $Version="1"

j_id_jsp_1095591257_1=j_id_jsp_1095591257_1&j_id_jsp_1095591257_1%3Apflegeart=1000&j_id_jsp_1095591257_1%3ApLZ=55425&j_id_jsp_1095591257_1%3Aentfernung=20&j_id_jsp_1095591257_1%3Aort=&j_id_jsp_1095591257_1%3Aj_id_jsp_1095591257_49=Suche%2Bstarten&javax.faces.ViewState=j_id10035%3Aj_id10037

HTTP/1.1 500 viewId:/infoSuche.jsp - View /infoSuche.jsp could not be restored.
Date: Thu, 31 Dec 2009 11:33:15 GMT
Server: Apache
ETag: W/"736-1258462944000"
Last-Modified: Tue, 17 Nov 2009 13:02:24 GMT
Content-Length: 736
Connection: close
Content-Type: text/html

Trying several User-Agent strings (Mozilla, Opera, etc.) didn't help. Please, I need your advice :)

Thanks a lot!

Replies are listed 'Best First'.
Re: LWP connection:keep_alive
by gmargo (Hermit) on Dec 31, 2009 at 13:58 UTC

    One potential problem is that you do not have a Referer: header, which is required by many sites.

    The code you posted confused me a bit. You generate a new LWP::UserAgent and then immediately do a POST. Where's the GET that set the cookie?

      How do I usually set the cookie? I thought $agent->cookie_jar( {} ); would take care of that...

      ...and do I need the GET command when I submit data via POST?

      Setting $agent->default_header('Referer' => 'http:blabla'); also ends with a closed session.

      However, could that be the result of the missing GET request that sets the cookie?
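
For what it's worth, here is a minimal sketch of the GET-then-POST flow the questions above are circling around. cookie_jar( {} ) only stores and replays cookies the server has already sent; it does not fetch one by itself, so a GET has to come first, as in the browser capture. The sketch also reads javax.faces.ViewState from the fetched page instead of hard-coding it. That a stale ViewState is behind the "View could not be restored" response is an inference, not something confirmed in this thread. Assumptions: the field names from the capture stay stable between page loads, and the page carries the ViewState in a hidden input. The helper names extract_view_state, build_form and search are made up for illustration.

```perl
use strict;
use warnings;

# Pull the javax.faces.ViewState value out of the hidden <input> in the form.
# Assumes the value is rendered as a quoted attribute of that input tag.
sub extract_view_state {
    my ($html) = @_;
    for my $tag ($html =~ /(<input\b[^>]*>)/gi) {
        next unless $tag =~ /\bname\s*=\s*["']javax\.faces\.ViewState["']/i;
        return $1 if $tag =~ /\bvalue\s*=\s*["']([^"']*)["']/i;
    }
    return;    # no such field on the page
}

# Build the form fields for the POST. The field names are copied from the
# capture in the question. The submit value is passed with a literal space:
# LWP URL-encodes the form itself, so 'Suche+starten' would go out as
# Suche%2Bstarten (as seen in the script's capture).
sub build_form {
    my ($view_state) = @_;
    my $f = 'j_id_jsp_1095591257_1';
    return [
        $f                             => $f,
        "${f}:pflegeart"               => '1000',
        "${f}:pLZ"                     => '55425',
        "${f}:entfernung"              => '20',
        "${f}:ort"                     => '',
        "${f}:j_id_jsp_1095591257_49"  => 'Suche starten',
        'javax.faces.ViewState'        => $view_state,
    ];
}

# GET the page first (the server sets JSESSIONID), then POST with the same
# agent so the cookie jar sends the session cookie back.
sub search {
    my ($url) = @_;
    require LWP::UserAgent;    # loaded here so the helpers above work without LWP
    my $ua = LWP::UserAgent->new(keep_alive => 1, cookie_jar => {});

    my $page = $ua->get($url);
    die 'GET failed: ' . $page->status_line . "\n" unless $page->is_success;

    my $view_state = extract_view_state($page->decoded_content);
    die "no javax.faces.ViewState field found\n" unless defined $view_state;

    my $result = $ua->post($url, build_form($view_state), Referer => $url);
    die 'POST failed: ' . $result->status_line . "\n" unless $result->is_success;
    return $result->decoded_content;
}
```

It would be called as print search('http://www.blabla.de/blabla/faces/infoSuche.jsp'); the important part is that one agent object, with its cookie jar, is reused for both requests.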