in reply to Re^6: WWW::Mechanize::Firefox delayed returns / slow
in thread WWW::Mechanize::Firefox delayed returns / slow

The $need_response is a (failed) optimization. I always need to store the response, even if it is not requested immediately. Later on, you might ask for $mech->code or other stuff contained only in the response.
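
To illustrate why the response has to be kept even when the caller does not ask for it, here is a minimal sketch. The `TinyMech` package, its `get` and `code` methods, and the fake `fetch` callback are hypothetical stand-ins made up for this example; they are not the actual internals of WWW::Mechanize::Firefox.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a browser driver; not WWW::Mechanize::Firefox code.
package TinyMech;

sub new {
    my ($class, %args) = @_;
    # 'fetch' is an assumed callback returning a hashref like { code => 200 }
    return bless { fetch => $args{fetch}, response => undef }, $class;
}

sub get {
    my ($self, $url) = @_;
    # Always store the response, even if the caller ignores the return value
    $self->{response} = $self->{fetch}->($url);
    return $self->{response};
}

sub code {
    my ($self) = @_;
    # Only works because get() kept the response around
    return $self->{response} ? $self->{response}{code} : undef;
}

package main;

my $mech = TinyMech->new( fetch => sub { return { code => 200, url => $_[0] } } );
$mech->get('http://example.com/');    # return value discarded
print "status: ", $mech->code, "\n";  # still available later
```

If `get` skipped storing the response whenever the caller did not want it, the later `code` call would have nothing to read.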

The rest of the behaviour depends on the site in question, so I can't really say what makes it happen without seeing some more, sorry.

Re^8: WWW::Mechanize::Firefox delayed returns / slow
by tcordes (Novice) on Dec 03, 2010 at 09:51 UTC

    OK, I've made a good small sample program using a public site.

    #!/usr/bin/perl -w
    $tabregex='deviantART';
    $|=1;
    use Data::Dumper;
    use WWW::Mechanize::Firefox;
    $ENV{'SHELL'}='/bin/bash';
    $Data::Dumper::Maxdepth=3;
    $www=WWW::Mechanize::Firefox->new(
        stack_depth=>5,
        autodie=>1,
        timeout=>60,
        tab=>qr/$tabregex/,
        bufsize => 50000000
    );
    $www->events(['load','onload','loaded','DOMFrameContentLoaded','DOMContentLoaded','error','abort','stop']);
    print time." get #1\n";
    $www->get('http://cmcc.deviantart.com/');
    print time." after get #1\n";
    print time." content #1\n";
    $con=$www->content();
    print time." after content #1\n";
    print time." get #2\n";
    $www->get('http://cmcc.deviantart.com/#/d1a8l1t');
    print time." after get #2\n";
    print time." content #2\n";
    $con=$www->content();
    print time." after content #2\n";
    print time." saveurl #1\n";
    $www->save_url('http://fc02.deviantart.net/fs30/i/2008/048/d/9/Wind_by_CMcC.jpg'=>'/tmp/Wind_by_CMcC');
    print time." after get #2\n";

    With all the added debug prints I put in the module, my output looks like this. Note the time() values, which show long delays at nearly every step: about 65 seconds inside _wait_while_busy for get #1, and 37 seconds for the first content() call. In fact, this sample program runs far worse than my in-progress program, which at least mostly works now. Note that I've taken out the 20-second wait dropout code.

    1291368442 get #1
    1291368443 za before $self->_addEventListener($b,$events);
    1291368443 zb before $callback->();
    1291368443 zc before $self->_wait_while_busy($load_lock);
    1291368443 testing elements, last element 0 ::: before for $element (@elements) {
    1291368443 testing element, before if ::: before if ((my $s = $element->{busy} || 0) >= 1) {
    1291368443 uc _wait_while_busy sleep 1 ::: before sleep 1;
    1291368444 testing elements, last element 0 ::: before for $element (@elements) {
    1291368444 testing element, before if ::: before if ((my $s = $element->{busy} || 0) >= 1) {
    1291368509 returning from _wait_while_busy ::: before return $element;
    1291368509 zd after wait
    1291368509 after get #1
    1291368509 content #1
    1291368546 after content #1
    1291368546 get #2
    1291368547 za before $self->_addEventListener($b,$events);
    1291368547 zb before $callback->();
    1291368547 zc before $self->_wait_while_busy($load_lock);
    1291368547 testing elements, last element 0 ::: before for $element (@elements) {
    1291368547 testing element, before if ::: before if ((my $s = $element->{busy} || 0) >= 1) {
    1291368547 uc _wait_while_busy sleep 1 ::: before sleep 1;
    1291368548 testing elements, last element 0 ::: before for $element (@elements) {
    1291368548 testing element, before if ::: before if ((my $s = $element->{busy} || 0) >= 1) {
    Deep recursion on subroutine "MozRepl::RemoteObject::Instance::__attr" at /usr/lib/perl5/site_perl/5.10.0/MozRepl/RemoteObject.pm line 1342, <DATA> line 1.
    Deep recursion on subroutine "MozRepl::RemoteObject::unjson" at /usr/lib/perl5/site_perl/5.10.0/MozRepl/RemoteObject.pm line 1000, <DATA> line 1.

    Got a ton of deep recursion errors and had to ^C it. Note how long the content() call takes, and while running it takes up 100% of one of my cores.
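
    As a side note on measuring these delays: time() only has one-second resolution, so a small wrapper around the core Time::HiRes module gives finer numbers with less repetition. This is only a sketch; the `timed` helper is a made-up name, and the sleep is a stand-in for a real call such as `$www->get(...)`.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Time::HiRes qw(time sleep);

    $| = 1;

    # Hypothetical helper: run a code ref and report how long it took.
    sub timed {
        my ($label, $code) = @_;
        my $start   = time();
        my @result  = $code->();
        my $elapsed = time() - $start;
        printf "%-12s %.3f s\n", $label, $elapsed;
        return wantarray ? @result : $result[0];
    }

    # Stand-in workload; replace with e.g. sub { $www->get($url) }
    my $value = timed('sleep demo', sub { sleep(0.2); return 42 });
    print "returned $value\n";
    ```

    Wrapping each get() and content() call this way keeps the labels and the measurements together.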

      I used the following, slightly changed, script, but it works very well for me. I've removed the superfluous event setting and parameters that WWW::Mechanize::Firefox doesn't support:

      #!/usr/bin/perl -w
      use strict;
      use WWW::Mechanize::Firefox;

      my $www=WWW::Mechanize::Firefox->new(
          #stack_depth=>5,
          autodie=>1,
          timeout=>60,
          #tab=>qr/$tabregex/,
          bufsize => 50_000_000,
      );
      #$www->events(['load','onload','loaded','DOMFrameContentLoaded','DOMContentLoaded','error','abort','stop']);
      print time." get #1\n";
      $www->get('http://cmcc.deviantart.com/');
      print time." after get #1\n";
      print time." content #1\n";
      my $con=$www->content();
      print time." after content #1\n";
      print time." get #2\n";
      $www->get('http://cmcc.deviantart.com/#/d1a8l1t');
      print time." after get #2\n";
      print time." content #2\n";
      $con=$www->content();
      print time." after content #2\n";

      Note that I had to allow JavaScript in the NoScript plugin for deviantart.net, as the second URL uses JavaScript to display a single image. Other than that, I get the following (relatively quick) output:

      1291386396 get #1
      1291386398 after get #1
      1291386398 content #1
      1291386398 after content #1
      1291386398 get #2
      1291386399 after get #2
      1291386399 content #2
      1291386399 after content #2

      If you are feeling adventurous, you can use the following modified JavaScript to make Firefox display an alert box whenever it captures an event:

      sub _addEventListener {
          my ($self,$browser,$events) = @_;
          $events ||= $self->events;
          $events = [$events]
              unless ref $events;

          # This registers multiple events for a one-shot event
          my $make_semaphore = $self->repl->declare(<<'JS');
          function(browser,events) {
              var lock = {};
              lock.busy = 0;
              var b = browser;
              var listeners = [];
              for( var i = 0; i < events.length; i++) {
                  var evname = events[i];
                  var callback = (function(listeners,evname){
                      return function(e) {
                          if (! lock.busy) {
                              lock.busy++;
                              lock.event = evname;
                              lock.js_event = {};
                              lock.js_event.target = e.originalTarget;
                              lock.js_event.type = e.type;
                              alert("Caught first event " + e.type + " " + e.message);
                          } else {
                              alert("Caught duplicate event " + e.type + " " + e.message);
                          };
                          for( var j = 0; j < listeners.length; j++) {
                              b.removeEventListener(listeners[j][0],listeners[j][1],true);
                          };
                      };
                  })(listeners,evname);
                  listeners.push([evname,callback]);
                  b.addEventListener(evname,callback,true);
              };
              return lock
          }
      JS
          return $make_semaphore->($browser,$events);
      };
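
      The one-shot logic in that JavaScript (the first event to fire marks the lock busy, then every registered listener is removed) can be tried out without a browser. The following is a plain-Perl simulation only: `add_listener`, `remove_listener`, `fire` and `make_lock` are made-up helpers standing in for the browser's event target, not part of WWW::Mechanize::Firefox or MozRepl.

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;

      # Hypothetical in-memory event target standing in for the browser object.
      my %listeners;    # event name => array ref of callbacks

      sub add_listener {
          my ($ev, $cb) = @_;
          push @{ $listeners{$ev} }, $cb;
      }

      sub remove_listener {
          my ($ev, $cb) = @_;
          $listeners{$ev} = [ grep { $_ != $cb } @{ $listeners{$ev} || [] } ];
      }

      sub fire {
          my ($ev) = @_;
          # Iterate over a copy, because callbacks remove listeners while running
          my @current = @{ $listeners{$ev} || [] };
          $_->($ev) for @current;
      }

      # Register one callback per event name; the first event to fire wins.
      sub make_lock {
          my (@events) = @_;
          my %lock = ( busy => 0 );
          my @registered;
          for my $evname (@events) {
              my $callback = sub {
                  if ( !$lock{busy} ) {
                      $lock{busy}++;
                      $lock{event} = $evname;
                  }
                  # One-shot: drop every listener that was registered for this lock
                  remove_listener(@$_) for @registered;
              };
              push @registered, [ $evname, $callback ];
              add_listener( $evname, $callback );
          }
          return \%lock;
      }

      my $lock = make_lock(qw(DOMContentLoaded load error));
      fire('load');
      fire('DOMContentLoaded');    # ignored: listeners were already removed
      print "busy=$lock->{busy} event=$lock->{event}\n";
      ```

      In this simulation, if none of the registered names ever fires, the lock never becomes busy, and anything waiting on it would keep waiting.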
Re^8: WWW::Mechanize::Firefox delayed returns / slow
by tcordes (Novice) on Dec 03, 2010 at 08:39 UTC

    Is there some code I could plunk in to see what firefox events *are* firing through MozRepl? I'm putting in every name of every event I can find on the net hoping to hit the right one.

    I'm going to try to find a public site I can replicate this problem on with a smaller sample program.

      Sorry, but I'm not aware of any "catch-all" way to see a list of all events fired by Firefox (or by a particular Firefox window or browser object).