tcordes has asked for the wisdom of the Perl Monks concerning the following question:

I'm having some problems with WWW::Mechanize::Firefox.

The first problem is it fails most tests when installing. I got it to install with force and it seems to work for what I'm doing. I'll give details of the tests at bottom.

The big problem I'm having is that calls seem to take a long time to return. I'm simply doing a get() of a login page (immediately shows in firefox but doesn't return for about 30secs); then a filling in of the login page and a click to submit (fills in and goes to next page in firefox nearly immediately). Then the click() doesn't return for 145secs. Eventually it does return and my script continues. After that I do a get() on a known url on the same site. and it shows up in firefox immediately but takes precisely 30 secs to return. After that, I do a save_url() on an image and another get() on a similar page and the last get hangs forever (I've left it at least 15 mins).

The site I am accessing is extremely js-laden, with all sorts of js happening at all times I'm sure. However, firefox is showing the page immediately and reporting "done" in the status bar (ie: no ongoing loading).

So what's causing the delay? I looked through all docs and googles for details. I tried queue=>0/1 to no avail. I tried timeout=><smaller numbers> to no avail.

I made a small test case using my own web site with tiny static pages with no js in them and the return delay is about 5-10s per page, which is acceptable.

The strangest part is, when it's waiting to return, my cpu load is low. If it's not waiting on I/O or waiting on heavy calculations (tree traversals, etc) then what is it waiting for?

My cpan install errors are like below:

t/01-mech-destroy.t ............... ok t/48-mech-launch.t ................ ok t/49-mech-get-file.t .............. 1/7 # Loading /root/.cpan/build/WW +W-Mechanize-Firefox-0.40-fKA2yZ/t/49-mech-get-file.html MozRepl::RemoteObject: [Exception... "Component returned failure code: + 0x80520015 (NS_ERROR_FILE_ACCESS_DENIED) [nsIWebNavigation.loadURI]" + nsresult: "0x80520015 (NS_ERROR_FILE_ACCESS_DENIED)" location: "JS + frame :: chrome://global/content/bindings/browser.xml :: loadURIWith +Flags :: line 186" data: no] at /root/.cpan/build/WWW-Mechanize-Fire +fox-0.40-fKA2yZ/blib/lib/WWW/Mechanize/Firefox.pm line 623 # Looks like you planned 7 tests but ran 1. # Looks like your test exited with 255 just after 1. t/49-mech-get-file.t .............. Dubious, test returned 255 (wstat +65280, 0xff00)

Yes, my MozRepl was started in the browser beforehand. It works fine without any delays from the mozrepl telnet direct command line. No, I don't have any blockers at all intalled in firefox, except AdblockPlus which is only set to block a few ads on ebay, never anything else. My firefox setup is almost extension/addon free. I can't see anything I would be doing to cause NS_ERROR_FILE_ACCESS_DENIED; I have a pretty stock Firefox setup.

When it's doing the tests, I can see it causing firefox to do things. Some tests take many minutes to run, appearing hung.

My versions are

fedora 12 linux kernel 2.6.32.16-141.fc12.i686.PAE core2quad, 8GB RAM firefox-3.5.15-1.fc12.i686 perl-5.10.0-96.fc12.i686 MozRepl: latest from git as of 2010-12-01 (tried both stable and snaps +hot) WWW::Mechanize::Firefox latest from cpan as of 2010-12-01 WWW::Mechanize latest from cpan as of 2010-12-01 dependencies: # I try to install from Fedora yum rpms when possible js-devel-1.70-8.fc12.i686 perl-parent-0.223-1.fc12.noarch perl-Class-C3-XS-0.13-1.fc12.i686 perl-Algorithm-C3-0.08-2.fc12.noarch perl-Class-C3-0.22-1.fc12.noarch perl-MRO-Compat-0.11-2.fc12.noarch perl-Text-SimpleTable-2.0-2.fc12.noarch perl-Carp-Clan-6.00-5.fc12.noarch perl-IPC-Run-0.84-1.fc12.noarch perl-Data-Dump-1.15-1.fc12.noarch perl-JSON-2.15-5.fc12.noarch perl-Net-Telnet-3.03-9.fc12.noarch perl-YAML-Tiny-1.40-1.fc12.noarch perl-PAR-Dist-0.46-1.fc12.noarch perl-File-Remove-1.42-3.fc12.noarch perl-Module-CoreList-2.17-96.fc12.i686 1:perl-ExtUtils-CBuilder-0.24-96.fc12.i686 1:perl-Module-Build-0.3200-96.fc12.i686 perl-Module-ScanDeps-0.95-1.fc12.noarch perl-Module-Install-0.91-2.fc12.noarch perl-Spiffy-0.30-11.fc12.noarch perl-Algorithm-Diff-1.1902-8.fc12.noarch perl-Text-Diff-1.37-2.fc12.noarch perl-Test-Base-0.58-2.fc12.noarch perl-CPAN-DistnameInfo-0.09-1.fc12.noarch Non-Fedora-rpm cpan dependencies all latest as of 2010-12-01
Thanks graciously!!

Replies are listed 'Best First'.
Re: WWW::Mechanize::Firefox delayed returns / slow
by Corion (Patriarch) on Dec 01, 2010 at 15:04 UTC

    The failures really look like as if Firefox fails to load local files. This is very weird.

    The Perl program is likely waiting for Firefox/Javascript to send the event that the page has loaded completely. Why this is slow for you, I don't know. What version of Firefox are you running?

    I'm not sure what is causing the delay for you. I think that maybe in upcoming versions, I'll add more debugging features that output more diagnostics.

      I solved the local file problem. It was indeed just a matter of the tests running as root in cpan cmd line while my firefox was running as non-root. I gave my non-root user access to the files and now the tests are running much better.

      Here are the only failures:

      t/50-app-new.t .................... 6/8 # Failed test 'The name is "Standard"' # at t/50-app-new.t line 46. # got: 'Default' # expected: 'Standard' # Looks like you failed 1 test of 8. t/50-app-new.t .................... Dubious, test returned 1 (wstat 25 +6, 0x100) Failed 1/8 subtests t/50-mech-bufsize.t ............... 4/8

      It hangs on the last one at 4/8. I waited 15 mins, but it's hung. The hang seems very similar to the hangs my program is having. I added some debugs to that test and tried again. It's hanging forever on the mech->content_as_png call. Commenting out that call I can get the tests to continue.

      t/70-urlbar.t ..................... 1/2 # Failed test 'We changed the location once' # at t/70-urlbar.t line 48. # got: '0' # expected: '1' Use of uninitialized value $this in pattern match (m//) at t/70-urlbar +.t line 49, <DATA> line 1. # Failed test '... to that other page' # at t/70-urlbar.t line 49. # undef # doesn't match '(?-xism:/70-urlbar-2.html$)' # Looks like you failed 2 tests of 2. t/70-urlbar.t ..................... Dubious, test returned 2 (wstat 51 +2, 0x200) Failed 2/2 subtests

      Also, I get a lot (dozen+) of these warnings:

      Subroutine MozRepl::__load_plugins redefined at /usr/lib/perl5/site_pe +rl/5.10.0/Module/Pluggable/Fast.pm line 104, <DATA> line 1.

      And I had to turn off my proxy settings ("direct connection") to make a few tests pass (my squid seems to return things other than what the tests expect).

        t/50-app-new.t .................... 6/8 # Failed test 'The name is "Standard"' # at t/50-app-new.t line 46. # got: 'Default' # expected: 'Standard' # Looks like you failed 1 test of 8.

        Ouch - the name of the "Default" theme in Firefox is localized. I have to take care of that in my tests, likely by not testing for the name but the UUID of the theme.

        t/50-mech-bufsize.t ............... 4/8

        If that test hangs, this is bad for large webpages. It loads an image-heavy deviantart.com page, so that should work in any case. Maybe this is an issue with your proxy setup, but without knowing more about it, that'll be hard to replicate.

        It would be nice if you could tell me what tests fail (and how) with your proxy, as that would help me make the tests (and maybe the module) more resilient against the proxy results.

        t/70-urlbar.t ..................... 1/2 # Failed test 'We changed the location once' # at t/70-urlbar.t line 48. # got: '0' # expected: '1' Use of uninitialized value $this in pattern match (m//) at t/70-urlbar +.t line 49, <DATA> line 1. # Failed test '... to that other page' # at t/70-urlbar.t line 49. # undef # doesn't match '(?-xism:/70-urlbar-2.html$)' # Looks like you failed 2 tests of 2. t/70-urlbar.t ..................... Dubious, test returned 2 (wstat 51 +2, 0x200) Failed 2/2 subtests

        These test failures are weird, as they also seem to indicate that the script does not receive some Firefox events. I'm not sure what to make of that.

      Hi! My firefox is 3.5.15, the latest rpm for Fedora 12. I will be switching to F14 in the not-too-distant future and I'm sure that will update me to 3.6. Maybe that will help.

      What local files would MozRepl be trying to load? My firefox is already open and it works fine/normal while browsing hundreds of pages all day long. Hmmm, perhaps because I'm doing the cpan install as root but the firefox is running as my non-root user? I'll investigate that.

      I think your guess about waiting is correct. Like I said, things always load immediately when a mech call is made, but the calls then sit there for a long time before returning.

      Further tests on simple sites indicate that it probably is just this over-the-top js site I am trying to mech that is causing the problems. The simple sites seem to mech just fine.

      Perhaps a "max_wait" option could be added to force mech call returns after X seconds even if they think they still need to wait. Something like "return with what you have at the moment" shortcut.

      I'm pretty good with perl and I could certainly help with coding and testing if I get some hints on where to look. The module and its zillion dependencies taken together are quite a daunting pile for me at the moment.

      Success! I debugged and hacked some of the pm code and got the mech to go without hangs or overly long pauses.

      What I did is just a kludge, but it could hint at the real solution.

      I found that the get() was calling synchronize(). Synchronize would be slow (5-20 secs) on the $callback->(). The big problem though is after get()ing a couple of pages the subsequent _wait_while_busy($load_lock) hangs forever.

      So in _wait_while_busy I added before the while loop: my $start=time; and added at the end of the for loop: return $element if time-$start>5;

      Now my program seems to work as expected! Hooray!

      Obviously we're waiting on events that never happen. Again, some global timeout might help. Or, a rejig of this code if this is a true bug (and not just a quirk of the site I'm crawling).

      I can continue to help as required. Thanks!

        ->synchronize waits on the appropriate event to fire in Firefox. If it takes longer than you expect, most likely you're waiting for the wrong event. See the ->events method and the events => argument to the constructor on how to define your own events.

        Just adding a timeout in ->_wait_while_busy only papers over the fact that you're not receiving (or rather, listening to) the right event.

        Now, finding the right event to listen to requires some ingenuity, as the default events (DOMFrameContentLoaded, DOMContentLoaded, error, abort, stop) don't seem to fire "soon enough" in your case. Maybe load is another good event to listen to, but it fires before subframes have loaded.