doowah2004 has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to scrape a site to get info from a company's database so that I can build our own. It is our info and my use is totally legit, but the company no longer supports an app that would do this task for me. That said, I have been trying to use WWW::Mechanize and I am having trouble logging in. They use frames, so I have been calling the main page directly... Anyway, I decided to try HTTP::Recorder.

I believe that everything is set up correctly because it logs basic browsing, but when I go to the site in question, it hangs, and if left alone long enough perl.exe will ramp up to 99% of the processor as reported by Windows.

Can someone else try to access the page and see if it logs/hangs, and try to log in to see if it records the attempt? Here is the page:
https://esolutions.landauerinc.com/lisn/protected/default.aspx
I cannot supply valid user/pass details because of the sensitivity of the data.

Thank you all in advance!

p.s. They seem to use a mishmash of JS and ASP...

Re: http::recorder hanging / using 99 percent processor
by stonecolddevin (Parson) on Oct 18, 2006 at 23:14 UTC

    Can you at least show us some code that you've tried thus far? Some error messages you're receiving when trying to log into this site?

    meh.
      Here is the code to log in to the site:

          use strict;
          use WWW::Mechanize;
          use Storable;

          my $login        = "user";
          my $password     = "pass";
          my $session_name = "me";
          my $filename     = "out.txt";
          my $filename2    = "out2.txt";
          my $login_url    = "https://esolutions.landauerinc.com/lisn/protected/default.aspx";
          my $dh_url       = "https://esolutions.landauerinc.com/lisn/protected/AssignedDose.aspx";

          # login to your LISN account
          my $mech = WWW::Mechanize->new( autocheck => 1 );
          $mech->get($login_url);
          $mech->submit_form(
              form_name => "_ctl0",
              fields    => { txtUserName => $login, txtPassword => $password }
          );

          # my $c = $mech->content;
          # open (FILE, ">out3.txt");
          # print FILE $c;
          # close FILE;

          $mech->get($login_url ); #, ':content_file' => $filename );

          # Enter Session Name
          # ($session_name is the variable declared above; $session is undeclared under strict)
          $mech->submit_form(
              form_number => 1,
              fields      => { txtUserName => $session_name }
          );

          # Get Dose History Search Form
          #$mech->get( $dh_url, ':content_file' => $filename );
          #print " ", -s $filename, " bytes\n";


      It is pretty straightforward; I based everything off of examples. I am calling $mech->get($login_url); twice because after a successful login it takes you back to a framed page, and I want to get back to the main page.
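For anyone following along, here is a minimal sketch of one way to see which pages a frameset actually loads, so that each frame can be fetched by URL instead of re-requesting the login page. It uses HTML::LinkExtor on an inline sample. The frameset markup, the frame names, and the example.com base URL are all assumptions made up for illustration; in a live script one would presumably pass `$mech->content` and `$mech->uri` instead.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical frameset standing in for the page returned after login.
my $html = <<'HTML';
<html><frameset cols="20%,80%">
  <frame name="menu" src="menu.aspx">
  <frame name="main" src="main.aspx">
</frameset></html>
HTML

# Placeholder base URL used to turn relative frame sources into absolute ones.
my $base = "https://example.com/lisn/protected/default.aspx";

# With no callback, the parser collects links for retrieval via links().
my $parser = HTML::LinkExtor->new( undef, $base );
$parser->parse($html);
$parser->eof;

# Each entry is [ $tag, $attr => $url, ... ]; keep only the frame sources.
my @frame_urls;
for my $link ( $parser->links ) {
    my ( $tag, %attr ) = @$link;
    push @frame_urls, "$attr{src}" if $tag eq 'frame' && $attr{src};
}

print "$_\n" for @frame_urls;
```

Each URL collected this way could then be handed to `$mech->get` in turn, which keeps the same cookie jar and therefore the same session.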

      Here is the code for the HTTP::Recorder script:

          use HTTP::Proxy;
          use HTTP::Recorder;

          my $proxy = HTTP::Proxy->new();

          # create a new HTTP::Recorder object
          my $agent = new HTTP::Recorder;

          # set the log file (optional)
          $agent->file("httpout.txt");

          # set HTTP::Recorder as the agent for the proxy
          $proxy->agent( $agent );

          # start the proxy
          $proxy->start();

          1;


      I took this straight from the HTTP::Recorder docs. Basically I was just hoping that someone could verify whether HTTP::Recorder behaves the same way for them when they visit that page, but I would love it if someone wants to help further.

      As far as error codes go, I am not receiving any. It is submitting the form but not logging in, so I think the problem is the way that they are using sessions, or the way I am handling (or not handling) them.
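One thing that may be worth ruling out when a form submits but the login does not take: ASP.NET pages typically carry hidden state fields (such as `__VIEWSTATE`), and the server may also expect the name of the submit button in the POST data. The sketch below uses HTML::Form on an inline sample to list every input and to compare a request made without a button against one made by clicking it. The form markup, the `btnLogin` button name, and the example.com URL are assumptions for illustration, not taken from the real page.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical ASP.NET-style login form; the real page's fields may differ.
my $html = <<'HTML';
<form name="_ctl0" method="post" action="default.aspx">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIzNDU2Nzg5Ozs+">
  <input type="text" name="txtUserName">
  <input type="password" name="txtPassword">
  <input type="submit" name="btnLogin" value="Login">
</form>
HTML

# Placeholder base URL, used to resolve the form's relative action.
my ($form) = HTML::Form->parse( $html, "https://example.com/lisn/protected/default.aspx" );

# List every input, including hidden ones that are easy to overlook.
for my $input ( $form->inputs ) {
    printf "%-10s %-15s %s\n",
        $input->type, $input->name // '(unnamed)', $input->value // '';
}

$form->value( txtUserName => "user" );
$form->value( txtPassword => "pass" );

# Request built without clicking anything: the submit button is not sent.
my $plain = $form->make_request;

# Request built by clicking the named button: its name/value pair is sent.
my $clicked = $form->click("btnLogin");

print "without button: ", $plain->content,   "\n";
print "with button:    ", $clicked->content, "\n";
```

With WWW::Mechanize, the equivalent of the second request is passing `button => "..."` to `submit_form`; whether this particular site needs it is something only the page source can confirm.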

      Thank you all again for looking at this.