in reply to Re^2: ForkManager Running Real Slow
in thread ForkManager Running Real Slow

Indeed. Of course, having four men at the face requires the tunnel to be 4 times as wide, and that slows forward progress somewhat. But if you factor in the reduction in the time you spend swapping people, there is a net gain. Done right, it's a 3x+ net gain.

And if you have a 4-wide tunnel and 8 shovels, although only 4 can dig at once, you save time swapping shovels: the next shift picks up the spare shovels and gets straight to digging, instead of waiting for the previous shift to hand theirs over.

Hm. Did we just stretch an analogy? :)
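In Parallel::ForkManager terms (the module in question), that's just capping the number of live children at the core count. A minimal sketch, assuming 4 cores and some do_chunk() routine (both purely illustrative):

    use strict;
    use warnings;
    use Parallel::ForkManager;

    my @chunks = 1 .. 8;                        # illustrative units of work
    my $pm     = Parallel::ForkManager->new(4); # four "men at the face", one per core

    for my $chunk (@chunks) {
        $pm->start and next;    # parent: move on and queue up the next chunk
        do_chunk($chunk);       # child: do one decent-sized piece of work
        $pm->finish;            # child exits; its slot (shovel) frees up
    }
    $pm->wait_all_children;

    sub do_chunk { my ($chunk) = @_; sleep 1 }  # stand-in for real work

Each child gets a decent-sized chunk, so the cost of forking is amortised over real work rather than paid per scoop.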


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP PCW It is as I've been saying!(Audio until 20090817)

Re^4: ForkManager Running Real Slow
by SuicideJunkie (Vicar) on Sep 04, 2009 at 14:34 UTC

    Well, the task does need to be massively parallelizable, since in theory it is possible to have a million threads working simultaneously. I suppose it isn't so much a tunnel as a sprawling mine webbed with shafts. But still only 4 shovels to go around.

    • Worker = thread
    • Shovel = CPU core
    • Passing shovel between workers = CPU context switch
    • Designated dig area/volume = combined list of thread tasks, order generally unimportant
    • Elevator (requires a shovel to operate the controls) = Spawn/kill threads
    • Boss leaning on a shovel for photo ops = CPU wasted on refreshing your desktop
    I'm not sure what RAM and swapping to disk would correspond to.
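    Pushing the mapping a little further: one common way to keep only four "shovels" in play is a fixed pool of worker threads pulling from a shared queue, rather than one thread per task. A rough sketch with threads and Thread::Queue (the work items and process_task() are made up):

        use strict;
        use warnings;
        use threads;
        use Thread::Queue;

        my $queue = Thread::Queue->new;

        sub process_task {                  # stand-in for real work on one item
            my ($task) = @_;
            # ... dig a scoop ...
        }

        # Four workers, one per shovel/core, however many tasks exist.
        my @workers = map {
            threads->create( sub {
                while ( defined( my $task = $queue->dequeue ) ) {
                    process_task($task);
                }
            } );
        } 1 .. 4;

        $queue->enqueue($_) for 1 .. 1_000;     # the "designated dig area" (dummy items)
        $queue->enqueue(undef) for @workers;    # one end-of-work marker per worker
        $_->join for @workers;

    However many tasks pile up in the queue, there are never more than four shovels being handed around.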
      SuicideJunkie:

      When you've got too many tasks that need to be moved forward asynchronously, you might want to use a state-machine architecture.

      One example: have perhaps a couple of threads per CPU, each with its own task list. Each thread just loops over its tasks, moving them ahead as far as it can. Whenever you add a task to the mix, just add it to the shortest task list. You might also have an event handler that, when it sees particular events occur, updates the information for the appropriate task; the thread can then do any processing required the next time it examines that task.

      In the tunnel-digging analogy, you might have a dozen people digging a four-person wide hole. Each one has a shovel and a bucket. One task would be to dig out a scoop of dirt and toss it into the bucket. Another task might be to carry full buckets out to the dumping area. So you've got a million "dig a scoop" jobs.

      The first person scoops out some dirt and tosses it in the bucket. Then he looks at the next task on the list: carry a full bucket to the dumping area. His current bucket isn't full, so that task stays on the list and he goes on to the next "dig a scoop" task. He digs another scoop of dirt and puts it into the bucket. Eventually he'll reach a "carry the bucket" task when his bucket is full, so he'll pick up the bucket and carry it to the dump. In the meantime, since a slot has opened up at the face, one of the other threads waiting patiently for a place at the front of the line walks up and starts processing.
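      A minimal, self-contained sketch of that loop (the task records, the ten-scoops-per-bucket rule, and every name below are invented for illustration): each task is a little record with a 'ready' flag; the thread sweeps its list, advances whatever is ready, and an "event" (here, the bucket filling) makes the carry task ready for a later pass.

          use strict;
          use warnings;

          my $bucket = 0;
          my @tasks  = (
              ( map { +{ name => 'dig a scoop', ready => 1, done => 0 } } 1 .. 10 ),
              { name => 'carry the bucket', ready => 0, done => 0 },
          );

          while ( my @live = grep { !$_->{done} } @tasks ) {
              for my $task (@live) {
                  next unless $task->{ready};    # precondition not met; skip this pass
                  if ( $task->{name} eq 'dig a scoop' ) {
                      $bucket++;                 # one scoop into the bucket
                  }
                  else {
                      print "carrying $bucket scoops to the dump\n";
                      $bucket = 0;
                  }
                  $task->{done} = 1;
              }
              # "event handler": once the bucket is full, the carry task becomes ready
              $_->{ready} = 1 for grep { $_->{name} eq 'carry the bucket' && $bucket >= 10 } @tasks;
          }

      A real loop would block or poll for events rather than spinning, but that's the shape of it.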

      Alternatively, you might be slicing your tasks up too finely. You might need to make your tasks larger so the thread can do more work without worrying about context switching. Using the tunnel analogy again, let's say there's only one kind of task: "Dig as much as you can, fill the bucket, and carry the dirt to the dump". In this case, the person isn't cycling through the tiny tasks; rather, he'd have hundreds of tasks like:

      while ( !bucket_is_full() ) {
          my $dirt = dig_a_scoop();
          toss_in_bucket($dirt);
      }
      move_to($dump);
      empty_bucket();
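      For what it's worth, here's that snippet fleshed out with throwaway stubs so it actually runs (every name is invented purely for illustration):

          use strict;
          use warnings;

          my $bucket = 0;
          my $dump   = 'the dump';

          sub bucket_is_full { $bucket >= 10 }
          sub dig_a_scoop    { 1 }                 # one scoop of "dirt"
          sub toss_in_bucket { $bucket += shift }
          sub move_to        { print "carrying the bucket to $_[0]\n" }
          sub empty_bucket   { $bucket = 0 }

          # One coarse-grained task, start to finish, with no hand-offs in the middle.
          while ( !bucket_is_full() ) {
              my $dirt = dig_a_scoop();
              toss_in_bucket($dirt);
          }
          move_to($dump);
          empty_bucket();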
      ...roboticus