zacc has asked for the wisdom of the Perl Monks concerning the following question:

I am using Parallel::ForkManager with ActiveState Perl 5.10.1 (1006) on Vista Home Premium SP1, and have been seeing something rather strange.

When initiated in "debug" mode

my $pm = new Parallel::ForkManager(0);

the code completes in about 11 seconds - but if I tell it to run any number of instances (including 1)

my $pm = new Parallel::ForkManager(1);
my $pm = new Parallel::ForkManager(5);

then the run time INCREASES ("1" takes seconds and "5" takes many hours to run) rather than showing the expected decrease (given that I'm running the code on a quad-core system that is otherwise doing naff all ... and the Windows Performance Monitor suggests that the other cores come alive as expected).

Which suggests that I'm doing summat odd - but I've tried to implement code based on the reference model provided in the ForkManager documentation.

sub scoring { return 1; }

my @results = ();
$pm->run_on_finish( sub {
    my ($pid, $exit_code, $ident) = @_;
    $results[$ident] = $exit_code;
} );

for ( my $i = 0; $i < 1000000; $i++ ) {
    my $pid = $pm->start($i) and next;
    my $result = scoring();
    $pm->finish($result);
}
$pm->wait_all_children;

This is cut-down PoC code - the real version does much more processing in each child, but the runtime issue is the same: massively increased run time when actually forking.
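For comparison, here is a batched variant of the PoC above - a sketch only, using core fork/waitpid and a pipe rather than Parallel::ForkManager, and the worker count and slicing scheme are illustrative assumptions. The idea is to fork once per worker, let each child do a whole slice of the loop, and send one summary result back, so the fork/reap overhead is paid a handful of times instead of a million:

```perl
use strict;
use warnings;

sub scoring { return 1 }    # stand-in for the real per-item work

my $total   = 1_000_000;
my $workers = 4;            # assumption: one worker per core
my $slice   = $total / $workers;

my %reader_for;             # pid => read handle
for my $w ( 0 .. $workers - 1 ) {
    pipe( my $r, my $wtr ) or die "pipe: $!";
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    if ( $pid == 0 ) {              # child: do a whole slice, report once
        close $r;
        my $sum = 0;
        $sum += scoring() for 1 .. $slice;
        print {$wtr} "$sum\n";
        close $wtr;
        exit 0;
    }
    close $wtr;                     # parent: keep only the read end
    $reader_for{$pid} = $r;
}

my $grand = 0;
for my $pid ( keys %reader_for ) {
    my $fh = $reader_for{$pid};
    chomp( my $sum = <$fh> );
    $grand += $sum;
    close $fh;
    waitpid( $pid, 0 );
}
print "total: $grand\n";            # 1000000 when every call returns 1
```

The same batching works with Parallel::ForkManager itself: move the inner loop into the child and call $pm->start once per batch rather than once per item.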

Any ideas ?

Re: ForkManager Running Real Slow
by BrowserUk (Patriarch) on Sep 03, 2009 at 22:36 UTC

    Not surprising really. You are starting & reaping a million threads, each of which calls a sub that returns a constant and then dies. The cost of starting and reaping each thread is about 1000 times the cost of calling that sub. The more concurrent threads you run, the more competition there is for system resources (mostly memory), so the longer it takes to start and reap each thread.

    For want of a better analogy, it's like having a million navvies try to dig a trench whilst sharing one shovel.

    Perhaps a better analogy: a million miners dig a one-man-wide tunnel, one shovel load at a time. The more you send into the tunnel, the harder it is for each man to get to the digface and get his one shovel full.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      In this case, I think he has four shovels available (quad core). The mosh pit mine analogy still holds however. Throw in a long, slow, creaky elevator ride for the start/reap of miner threads. Shoving a million people into the mine and bringing them back out takes far longer than it does to actually swing the four shovels a quarter million times each.

      The code should probably be refactored into four guys sharing the four shovels, with a list of places to dig next after they finish their current one, and only take the expensive elevator back up to the surface when everything is done. If there is a lot of I/O paperwork to do before swinging the shovel each time, then you could use three or four miners per shovel; most of them won't need the shovel at any one time.

        Indeed. Of course, to have four men at the face requires the tunnel to be 4 times as wide, and that slows forward progress somewhat. But if you factor in the reduction in the time you spend swapping people, there is a net gain. Done right, it's a 3x+ net gain.

        And if you have a 4-wide tunnel and 8 shovels, although only 4 can dig at once, you save time swapping shovels: the next shift picks up the spare shovels and gets straight to digging instead of waiting for the outgoing shift to hand theirs over.

        Hm. Did we just stretch an analogy? :)


Re: ForkManager Running Real Slow
by zacc (Novice) on Sep 04, 2009 at 17:18 UTC
    So, lemme get this right ... if I need to dig a hole and have "n" spades available, you guys are the people to talk to about how fast I could dig half a hole ... with half as many, or twice as many, spades as I have miners (ignoring elevator rides up and down to the coal face).

    But if I need to run a million very short-lived calculations, then I shouldn't be so dumb as to think that spawning and reaping a process per calculation would help performance any...

    Just as well you don't charge by the analogy, isn't it!

    Thanks

      Just as well you don't charge by the analogy, isn't it!

      No analogy. Just the facts:

      use Time::HiRes qw[ time ];;
      $start = time; scoring() for 1 .. 1e6;
      printf "Took: %.15f / iteration\n", ( time() - $start ) / 1e6;;
      Took: 0.000000280799866 / iteration
      $start = time; async{ return 1 }->join for 1.. 10;
      printf "took: %.15f / iteration\n", ( time() - $start ) / 10;;
      took: 0.011679911613464 / iteration
      print 0.011679911613464 / 0.000000280799866;;
      41595.1466781113

      Seems I under-estimated the differential a tad.
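      The same gap can be reproduced with nothing but core Perl and Time::HiRes - this is a sketch, the iteration counts are arbitrary assumptions, and on Win32 (where fork is emulated with threads) the ratio tends to be even worse than a Unix fork:

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

sub scoring { return 1 }

# Cost of a plain sub call, averaged over many iterations
my $n  = 100_000;
my $t0 = time;
scoring() for 1 .. $n;
my $per_call = ( time - $t0 ) / $n;

# Cost of one fork + reap round trip (far fewer iterations needed)
my $forks = 20;
$t0 = time;
for ( 1 .. $forks ) {
    my $pid = fork();
    die "fork: $!" unless defined $pid;
    exit 0 if $pid == 0;            # child does no work at all
    waitpid( $pid, 0 );
}
my $per_fork = ( time - $t0 ) / $forks;

printf "sub call: %.9fs  fork+reap: %.9fs  ratio: %.0fx\n",
    $per_call, $per_fork, $per_fork / $per_call;
```

      Whatever the exact numbers on a given box, the fork+reap round trip dwarfs the sub call, which is why the per-item forking in the PoC dominates the run time.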

