mickey has asked for the wisdom of the Perl Monks concerning the following question:

Because of the firewall issues I'm dealing with that I asked about yesterday, I'm trying to use Win32::IE::Mechanize to scrape data from some HTTPS websites.

The issue I'm having is that the execution of my script doesn't seem to wait for IE to load the next page.

My code looks like this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Win32::IE::Mechanize;

    my $ie = Win32::IE::Mechanize->new(visible => 1);

    my $url = "https://www.securesite.com";
    print "Requesting '$url'... ";
    my $ok = $ie->get($url);
    # if/else rather than "print ... && exit 1", which would exit before printing
    if ($ok) { print "ok\n" } else { print "failed\n"; exit 1 }

    print "Filling in login form... \n";
    $ie->form_number(1);
    print "Entering username... \n";
    $ie->field('username', 'username');
    print "Entering password... \n";
    $ie->field('password', 'password');
    print "ok\n";

    print "Submitting login form... ";
    $ok = $ie->click('_submit');
    if ($ok) { print "ok\n" } else { print "failed\n"; exit 1 }

    print "Looking for link... \n";
    my $link = $ie->find_link(n => 1);
    if (defined $link) {
        print "Found link... \n";
        print "\tText: " . $link->text . "\n";
        print "\tURL: " . $link->url . "\n";
        print "ok\n";
    }
    else {
        print "failed\n";
        die;
    }
    exit;

After the call to $ie->get(), the script waits until the page is fully loaded before printing 'ok', which is what it's supposed to do. After $ie->click(), however, it prints 'ok' quite quickly: not immediately, but before the new page has loaded. As a result, the link returned by the subsequent find_link() call is the first link on the login page, rather than the first link on the page that loads after login.

The page is being submitted -- the right page does load in the IE instance, but after the script has finished.

$ie->click() is actually a call to this method in the Win32::IE::Input class:

    =head2 $input->click

    Calls the C<click()> method on the actual object. This may not work.

    =cut

    sub click {
        ${ $_[0] }->click
    }

which is not encouraging.

Does anyone have any suggestions as to how to get this to do the right thing?

Thanks!

Replies are listed 'Best First'.
Re: Getting Win32::IE::Mechanize to wait for responses
by Corion (Patriarch) on Mar 04, 2005 at 17:34 UTC

      Actually, Win32::IE::Mechanize does do the polling.

      After it calls the click() method on the input object, it calls an internal method called _wait_while_busy, which looks like this:

      sub _wait_while_busy {
          my $self = shift;
          my $agent = $self->{agent};

          # The documentation isn't clear on this.
          # The DocumentComplete event roughly says:
          # the event gets fired (for each frame) after ReadyState == 4
          # we might need to check if the first one has frames
          # and do some more checking.

          my $sleep = 4; # 0.4;
          # while ( $agent->{Busy} == 1 ) { $sleep and sleep( $sleep ) }
          # return unless $agent->{ReadyState};
          while ( $agent->{ReadyState} <= 2 ) {
              $sleep and sleep( $sleep );
          }

          $self->{ $_ } = undef for qw( forms cur_form links images );
          return $self->success;
      }

      It looks like the author tried polling the Busy property, found that it didn't work, and switched to polling the ReadyState property instead. That seems to work when the preceding call is $agent->navigate() (in Win32::IE::Mechanize::get()), but at least on my system it breaks when the preceding call is $input->click().

      But thanks for the link to the MSDN docs; I'll have a look and see what I can find out.
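One thing stands out in the _wait_while_busy code quoted above: its loop runs only while ReadyState <= 2, so it stops as soon as the browser reports state 3 (interactive), before the page is complete (state 4). A minimal sketch of a stricter wait follows. The helper name wait_for_complete and its timeout and interval options are hypothetical and not part of Win32::IE::Mechanize; the only assumption about the object is that it exposes Busy and ReadyState through hash access, as $self->{agent} does in the quoted code.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);    # fractional-second sleep and time

# READYSTATE_COMPLETE in the Internet Explorer automation model
use constant READYSTATE_COMPLETE => 4;

# Hypothetical helper, not part of Win32::IE::Mechanize.
# $agent is anything exposing Busy and ReadyState via hash access.
# Returns 1 once the agent is idle and complete, 0 on timeout.
sub wait_for_complete {
    my ( $agent, %opt ) = @_;
    my $timeout  = defined $opt{timeout}  ? $opt{timeout}  : 30;      # seconds
    my $interval = defined $opt{interval} ? $opt{interval} : 0.25;    # seconds
    my $deadline = time() + $timeout;

    while ( time() < $deadline ) {
        my $state = $agent->{ReadyState} || 0;
        return 1 if !$agent->{Busy} && $state == READYSTATE_COMPLETE;
        sleep($interval);
    }
    return 0;    # gave up: the page never reported complete
}
```

With a live browser this might be called as wait_for_complete( $ie->{agent} ) right after $ie->click('_submit'). One caveat: if click() returns before navigation has begun, the old page may still report state 4, so a short pause before polling could be needed; that is a guess about timing, not something the thread confirms.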

      Hi, this is my first post at perlmonks.com. If I'm in the wrong place, let me know. I have two questions. When I create an object with

      $IE = Win32::OLE->new('InternetExplorer.Application');

      what is returned to $IE? (Question 1) From the syntax used in other programs it looks like a reference to a hash. But when I try to dereference it and print out the keys and values, nothing is output to my terminal.

      while( ($key, $value) = each %$IE ) { print "$key => $value\n"; }

      Also, is there any Perl-specific documentation for this object? (Question 2) Thanks!

        Hello and welcome, allenaaker!

        In principle, there is nothing wrong with posting here, but it is better to start a question at Seekers of Perl Wisdom, as then many more people will see it.

        To your two questions:

        1. The object returned into $IE is a special object that tries to behave mostly like a normal Perl object, but in fact it is a magical C object that cannot really be inspected. It interfaces to the InternetExplorer.Application object of Windows, that is, to the Internet Explorer browser.

        2. To find information about InternetExplorer.Application, see the documentation by Microsoft. There is no Perl-specific documentation, but it is not hard to translate the VB examples into Perl.

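      You can use

      while ( $ie->{agent}->Document->readyState !~ /complete/i ) {
          # Four-argument select() pauses for 0.5 s; the built-in sleep(0.5) truncates to 0
          select( undef, undef, undef, 0.5 );
      }

      It works for me; I am using this. I hope it is useful. Any suggestions for improvement are appreciated.

      Thanks,
      Abhay K. Singh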
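The readyState loop above never gives up if the page fails to load. As a hedged sketch, the same idea can be wrapped in a generic poller with a timeout. The name wait_until and its arguments are hypothetical, not part of any module mentioned in this thread; Time::HiRes is used so that fractional sleeps actually pause.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep time);    # fractional-second sleep and time

# Hypothetical generic poller: calls $check until it returns true
# or $timeout seconds have elapsed. Returns 1 on success, 0 on timeout.
sub wait_until {
    my ( $check, $timeout, $interval ) = @_;
    $timeout  = 30  unless defined $timeout;     # seconds
    $interval = 0.5 unless defined $interval;    # seconds
    my $deadline = time() + $timeout;

    until ( $check->() ) {
        return 0 if time() >= $deadline;
        sleep($interval);
    }
    return 1;
}

# Intended use with the loop above (needs Windows and a live $ie):
# wait_until( sub { $ie->{agent}->Document->readyState =~ /complete/i } )
#     or die "page did not finish loading\n";
```

Passing the condition as a code reference keeps the timeout logic separate from the browser-specific check, so the same helper could poll Busy or ReadyState instead.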
Re: Getting Win32::IE::Mechanize to wait for responses
by johnnywang (Priest) on Mar 04, 2005 at 23:57 UTC
    There is also a module called SAMIE (it doesn't seem to be on CPAN; it's on SourceForge) that does something similar to IE::Mechanize. It has some kind of WaitForDocumentComplete() function, which might be helpful. BTW, I use IE::Mechanize, and it seems to be OK.