in reply to Re^2: WWW::Mechanize::Firefox delayed returns / slow
in thread WWW::Mechanize::Firefox delayed returns / slow

->synchronize waits on the appropriate event to fire in Firefox. If it takes longer than you expect, most likely you're waiting for the wrong event. See the ->events method and the events => argument to the constructor on how to define your own events.

Just adding a timeout in ->_wait_while_busy only papers over the fact that you're not receiving (or rather, listening to) the right event.

Now, finding the right event to listen to requires some ingenuity, as the default events (DOMFrameContentLoaded, DOMContentLoaded, error, abort, stop) don't seem to fire "soon enough" in your case. Maybe load is another good event to listen to, but it fires before subframes have loaded.
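
As a rough sketch of the constructor route mentioned above: the event list can be handed to new() through events =>. The URL is a placeholder, and the snippet skips the browser part when WWW::Mechanize::Firefox (or a Firefox it can connect to) is not available, so treat it as a starting point rather than a tested recipe for your site.

```perl
use strict;
use warnings;

# The default events named in this post, plus 'load' as an extra candidate.
my @events = qw(load DOMFrameContentLoaded DOMContentLoaded error abort stop);

# Only try to drive a browser if the module can be loaded at all.
if (eval { require WWW::Mechanize::Firefox; 1 }) {
    # new() dies if it cannot reach Firefox, so guard that as well.
    my $mech = eval { WWW::Mechanize::Firefox->new(events => \@events) };
    if ($mech) {
        $mech->get('http://example.com/');    # placeholder URL (assumption)
        print "Loaded: ", $mech->uri, "\n";
    }
    else {
        warn "Could not connect to Firefox: $@";
    }
}
else {
    warn "WWW::Mechanize::Firefox is not installed; skipping\n";
}
```

Trimming @events down to one entry at a time is a cheap way to find out which event, if any, actually arrives for a given page.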

Re^4: WWW::Mechanize::Firefox delayed returns / slow
by tcordes (Novice) on Dec 03, 2010 at 06:53 UTC

    My program is very simple. I am not doing any event stuff myself. I'm just doing extremely simple get() and saveurl(). However, I see what you are saying about events now that I've been playing with the module code for a few days.

    I'll see if I can figure out how to add the load event. I'm pretty sure my site has no subframes.

    I've taken to adding Data::Dumper calls in many places in the module to see what's going on. If any of that output would help you, let me know. Most of the time, the 100 pages of Perl code dumped for a single variable leave me dazed and confused, so I've started restricting the output to 2-3 levels deep.
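
    One way to restrict the depth is Data::Dumper's $Data::Dumper::Maxdepth setting. A minimal sketch follows; the nested hash is a made-up stand-in for the module's objects, not their real layout.

    ```perl
    use strict;
    use warnings;
    use Data::Dumper;

    # Hypothetical nested structure standing in for a large module object.
    my $object = {
        name => 'tab',
        repl => { bridge => { socket => { port => 4242 } } },
    };

    my $dumped = do {
        local $Data::Dumper::Maxdepth = 2;    # do not descend below the second level
        local $Data::Dumper::Sortkeys = 1;    # stable key order, easier to compare runs
        Dumper($object);
    };

    # Anything deeper than two levels is shown as a stringified reference.
    print $dumped;
    ```

    Using local keeps the setting confined to that one dump, so other Dumper calls in the module are unaffected.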

    Thanks, I will try more things and report back!

Re^4: WWW::Mechanize::Firefox delayed returns / slow
by tcordes (Novice) on Dec 03, 2010 at 07:39 UTC

    I've added a lot of debug code to the module to try various things. Maybe you can help make sense of the results?

    First off, a very strange thing: I put a warn before and after $callback->() in synchronize(). On the very first hit to that code in a run it always takes 30+ secs to execute a dummy callback!!? The output is below. The long number is time(). The $VAR output is Dumper($callback).

    1291361165 zb before cb
    $VAR1 = sub { "DUMMY" };
    1291361200 zc after cb, before wait

    After that first hit all subsequent $callback->() (which are also all DUMMY) take 0 secs! This doesn't make any sense to me at all. How can a dummy call take 30 secs?
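
    Since time() only has one-second resolution, a finer-grained measurement may help here. Below is a sketch using Time::HiRes; timed_call is a hypothetical helper, and the sleeping sub merely stands in for the real $callback.

    ```perl
    use strict;
    use warnings;
    use Time::HiRes qw(time sleep);

    # Hypothetical helper: run a code reference and report how long it took.
    sub timed_call {
        my ($label, $code) = @_;
        my $start   = time();
        my @result  = $code->();
        my $elapsed = time() - $start;
        warn sprintf("%s took %.3f s\n", $label, $elapsed);
        return ($elapsed, @result);
    }

    # Stand-in for the real callback; it just sleeps briefly.
    my ($elapsed, $value) = timed_call('callback', sub { sleep 0.2; 'done' });
    ```

    Wrapping each $callback->() this way would show whether the first call really blocks for the whole 30+ seconds or whether the time is spent somewhere around it.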

    I've tried adding load to the events (globally). Right before the $load_lock=... in synchronize() I added a Dumper($events) to confirm it's there:

    $VAR1 = [ 'load', 'DOMFrameContentLoaded', 'DOMContentLoaded', 'error', 'abort', 'stop' ];

    I also tried it with just load (no other events). In all cases _wait_while_busy loops until my 20-second timer stops it, so no change there. Are there any other events I can wait on?
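
    For reference, the kind of timer I mean looks roughly like the sketch below. This is not the module's code; wait_until and its arguments are made up for illustration, and the point is only that the return value records whether the condition was met or the deadline expired.

    ```perl
    use strict;
    use warnings;
    use Time::HiRes qw(time sleep);

    # Hypothetical helper: poll $is_done until it returns true or $timeout
    # seconds have passed. Returns 1 if the condition was met, 0 on timeout.
    sub wait_until {
        my ($is_done, $timeout, $interval) = @_;
        my $deadline = time() + $timeout;
        while (time() < $deadline) {
            return 1 if $is_done->();
            sleep $interval;
        }
        return 0;
    }

    my $polls   = 0;
    my $met     = wait_until(sub { ++$polls >= 3 }, 2, 0.05);    # true on the third poll
    my $expired = wait_until(sub { 0 }, 0.2, 0.05);              # never true, deadline wins
    print "met=$met expired=$expired\n";
    ```

    Logging the 0 case at least makes it obvious when the wait ended because of the timer and not because an event arrived.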

    I also Dumper'd the $element in the for $element loop of _wait_while_busy, but that returns some massive structures that don't mean much to me (you're losing me with all the javascript code). However, if tidbits could help debug, I can post here as required.

      By default, Data::Dumper replaces every code reference with sub { "DUMMY" }, so what you see in the dump is not the real callback. Most likely, the real callbacks wait for some results from Firefox.

      If you have a sample script and the website is somewhat public, I can try to reproduce the slowness.
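
      To see what a callback really contains, Data::Dumper can be asked to deparse code references. A small sketch, with a made-up callback in place of the module's real ones:

      ```perl
      use strict;
      use warnings;
      use Data::Dumper;

      # Made-up callback; the module's real callbacks will look different.
      my $callback = sub { my ($x) = @_; return $x + 1 };

      # Default behaviour: the code reference is printed as a placeholder.
      my $placeholder = Dumper($callback);

      # With Deparse enabled, B::Deparse reconstructs the source of the sub.
      my $deparsed = do {
          local $Data::Dumper::Deparse = 1;
          Dumper($callback);
      };

      print $placeholder, $deparsed;
      ```

      The deparsed text is a reconstruction, so it will not match the original source character for character, but it shows what the callback actually does.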

        Ah yes, that makes sense. When I commented out the $callback line nothing would happen in the browser :-) I didn't think Dumper was lying to me.

        Does it still strike you as odd that the first $callback->() hit takes 30-35 secs every time, but the 2nd+ calls are immediate?

        Also, I just noticed that while the first call to callback is slow, the first call of the program to the _wait_while_busy while loop takes only 1 sec. It's always the 2nd+ calls that hang. Does that make any sense?

        Also, why have you commented out the if ($need_response) in synchronize()? Since I never care about responses, I'm playing around with commenting out the $response_catcher= assignment to avoid all the voodoo in _install_response_header_listener. As you can see I'm shooting in the dark, but experimenting can't hurt anything.

        The site I am working with is a semi-private intranet. To possibly get you access I'd have to jump through a lot of hoops.

        If you can just throw me little crumbs of help I can do all the grunt work testing/debugging.

        Thanks!

        Since all the problems seem to happen on the 2nd+ calls, I tried to replace all mech calls with new/call/undef's. So instead of starting 1 mech and doing lots of get()s I am doing:

        $www = WWW::Mechanize::Firefox->new();
        $www->events();
        $www->get();
        undef $www;

        Strangely enough, this does nothing at all to change the _wait_while_busy hang behaviour! Well, it did one thing: the very first callback call took 0 secs. Very strange. I'm reverting to my old code, which makes only one call to WWW::Mechanize::Firefox->new().

        I should note that I am doing nothing with these pages I'm loading. I'm not filling in forms, or doing other mech stuff. I'm just get()ing, running some regexes on the content()s and then doing a saveurl() of a related file and on to the next get().

        I also tried adding this to the wait while loop, to no effect: $self->repl->poll;