in reply to Re^3: WWW::Mechanize::Firefox delayed returns / slow
in thread WWW::Mechanize::Firefox delayed returns / slow

I've added a lot of debug code to the module to try various things. Maybe you can help make sense of the results?

First off, a very strange thing: I put a warn before and after $callback->() in synchronize(). The very first time that code is hit in a run, it always takes 30+ seconds to execute what looks like a dummy callback. The output is below. The long numbers are time() values, and the $VAR1 line is the output of Dumper($callback).

1291361165 zb before cb
$VAR1 = sub { "DUMMY" };
1291361200 zc after cb, before wait

After that first hit, all subsequent $callback->() calls (which also all dump as DUMMY) take 0 seconds! This doesn't make any sense to me at all. How can a dummy call take 30 seconds?
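
Since time() only has one-second resolution, a finer-grained timer can make it easier to tell "0 secs" apart from "just under a second". The following is a minimal sketch, assuming a hypothetical helper named timed_call and a stand-in callback that merely sleeps; it is not code from the module itself.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # replaces time() with a sub-second version

# Hypothetical helper: run a code reference and report how long it blocked.
sub timed_call {
    my ($label, $code) = @_;
    my $start   = time;
    my @result  = $code->();
    my $elapsed = time - $start;
    warn sprintf "%s took %.3f s\n", $label, $elapsed;
    return ($elapsed, @result);
}

# Stand-in for the real $callback, which would be talking to Firefox.
my $callback = sub { select undef, undef, undef, 0.25; return 'done' };

my ($elapsed, $value) = timed_call('callback', $callback);
```

Wrapping the $callback->() call in synchronize() the same way would show whether the later calls are truly instant or just shorter than one second.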

I've tried adding load to the events (globally). Right before the $load_lock=... in synchronize() I added a Dumper($events) to confirm it's there:

$VAR1 = [ 'load', 'DOMFrameContentLoaded', 'DOMContentLoaded', 'error', 'abort', 'stop' ];

I also tried it with just load (no other events). In all cases _wait_while_busy loops until my 20-second timer stops it, so no change there. Are there any other events I can wait on?

I also Dumper'd the $element in the for $element loop of _wait_while_busy, but that returns some massive structures that don't mean much to me (you're losing me with all the javascript code). However, if tidbits could help debug, I can post here as required.

Re^5: WWW::Mechanize::Firefox delayed returns / slow
by Corion (Patriarch) on Dec 03, 2010 at 07:44 UTC

    By default, Data::Dumper prints every code reference as sub { "DUMMY" }, so the dump does not show what the callback really does. Most likely, these callbacks wait for some results from Firefox.

    If you have a sample script and the website is somewhat public, I can try to reproduce the slowness.
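
    The placeholder behaviour can be seen in isolation. The sketch below uses a made-up callback (not one from WWW::Mechanize::Firefox) and the $Data::Dumper::Deparse setting, which asks Data::Dumper to print the real body of a code reference via B::Deparse.

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    # Made-up callback standing in for the one passed to synchronize().
    my $callback = sub { return "real body: $_[0]" };

    # Default: the body of a code reference is hidden behind a placeholder.
    my $plain = Dumper($callback);

    # With Deparse enabled, the actual source of the sub is printed instead.
    my $deparsed = do {
        local $Data::Dumper::Deparse = 1;
        Dumper($callback);
    };

    print $plain;       # $VAR1 = sub { "DUMMY" };
    print $deparsed;    # the deparsed source of the sub
    ```

    Setting $Data::Dumper::Deparse = 1 before dumping $callback in synchronize() should therefore show what the callback really does, although closures over remote objects may still be hard to read.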

      Ah yes, that makes sense. When I commented out the $callback line nothing would happen in the browser :-) I didn't think Dumper was lying to me.

      Does it still strike you as odd that the first $callback->() hit takes 30-35 seconds every time, but the second and later calls are immediate?

      Also, I just noticed that while the first call to callback is slow, the first call of the program to the _wait_while_busy while loop takes only 1 sec. It's always the 2nd+ calls that hang. Does that make any sense?

      Also, why have you commented out the if ($need_response) in synchronize()? Since I never care about responses, I'm playing around with commenting out the $response_catcher= assignment to avoid all the voodoo in _install_response_header_listener. As you can see I'm shooting in the dark, but experimenting can't hurt anything.

      The site I am working with is a semi-private intranet. To possibly get you access I'd have to jump through a lot of hoops.

      If you can just throw me little crumbs of help I can do all the grunt work testing/debugging.

      Thanks!

        The $need_response is a (failed) optimization. I always need to store the response, even if it is not requested immediately. Later on, you might ask for $mech->code or other stuff contained only in the response.

        The rest of the behaviour depends on the site in question, so I can't really say what makes it happen without seeing some more, sorry.

      Since all the problems seem to happen on the 2nd+ calls, I tried to replace all mech calls with new/call/undef's. So instead of starting 1 mech and doing lots of get()s I am doing:

      $www = WWW::Mechanize::Firefox->new();
      $www->events();
      $www->get();
      undef $www;

      Strangely enough, this change does nothing at all to the _wait_while_busy hang behaviour! It did have one effect: the very first callback call took only 0 seconds. Very strange. I'm reverting to my old code, which makes only one call to WWW::Mechanize::Firefox->new().

      I should note that I am doing nothing with these pages I'm loading. I'm not filling in forms, or doing other mech stuff. I'm just get()ing, running some regexes on the content()s and then doing a saveurl() of a related file and on to the next get().

      I also tried adding this to the wait while loop, to no effect: $self->repl->poll;

        Maybe it is the call to ->content that is slow? Consider using ->selector() and/or ->xpath to extract the element and then ->{innerHTML} to get at its contents.
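
        A rough sketch of that suggestion follows. It needs a running Firefox with the mozrepl extension, and the URL, the CSS selector, the XPath query and the regex are all hypothetical placeholders rather than anything taken from the site under discussion.

        ```perl
        use strict;
        use warnings;
        use WWW::Mechanize::Firefox;

        my $mech = WWW::Mechanize::Firefox->new();
        $mech->get('http://intranet.example/report');    # hypothetical URL

        # Fetch one element by CSS selector instead of serialising the whole page.
        # single => 1 dies unless exactly one element matches.
        my $node = $mech->selector('#results', single => 1);    # hypothetical selector
        my $html = $node->{innerHTML};

        # The same element, located via XPath instead.
        my $same = $mech->xpath('//*[@id="results"]', single => 1);

        # Run the regex on the fragment only (hypothetical pattern).
        print "matched\n" if $html =~ /some pattern/;
        ```

        Running the regexes on a single element's innerHTML avoids transferring the whole document through ->content on every page.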

        Changing the ->_wait_while_busy subroutine around will only destabilize the whole thing, as your script will no longer wait for Firefox to signal that it is ready. Doing this without knowing when and why to do it will only end in tears. It is an internal method and should not be ignored lightly (and if you want to ignore it, there most likely are routines more to the point than this one).