PerlMonks  

Re: Unexpected output from fork (Win32)

by BrowserUk (Patriarch)
on Aug 09, 2004 at 11:58 UTC


in reply to Unexpected output from fork (Win32)

It comes down to buffering.

The first time you ask perl to read a single line from the input file with <IN>, perl reads a buffer-sized chunk from the file and then gives you back a single line. Subsequent calls to <IN> (for that kid) then give you the next line from the buffer until it is exhausted, at which point perl refills the buffer.
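A minimal sketch of how to see that read-ahead for yourself, assuming a throwaway test file of 16-byte lines (the temp file and the variable names are made up for illustration, not taken from the original program). After a single <$in>, tell reports the position just past the first line, while sysseek with a zero offset reports where the operating system's file pointer has got to. The exact buffer size depends on the perl build, so the figures printed may differ from the 4096 bytes discussed here.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

# Build a throwaway input file of 16-byte lines (15 digits plus "\n").
# The file and its line format are illustrative assumptions only.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
printf {$out} "%015d\n", $_ for 1 .. 10_000;
close $out or die "close: $!";

open my $in, '<', $path or die "open: $!";
binmode $in;
my $first_line = <$in>;    # one line requested, one buffer load read

# tell() reports where perl's buffered reader is: just past the first line.
my $logical_pos = tell $in;

# sysseek() with a zero offset only reports the OS-level file position,
# which is past the whole chunk perl pulled into its buffer.
my $os_pos = sysseek $in, 0, SEEK_CUR;

printf "logical position: %d bytes\n", $logical_pos;
printf "OS position:      %d bytes\n", $os_pos;
printf "lines per buffer: %d\n",       $os_pos / 16;
close $in;
```

The gap between the two positions is the data sitting in perl's buffer, waiting to be handed out one line at a time.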

Each of your kids has its own buffer, but they all read from the same underlying file position. So each kid processes a group of lines (as many as fill its internal buffer) sequentially before reading the next buffer load. By the time that happens, each of the other kids has already filled its own buffer, so this kid gets a buffer load that starts 10 buffer sizes further down the file.

As your lines are 16 bytes, and each kid is processing 256 lines per buffer load, that makes the buffer size 4096 bytes. Notionally, the first thread to run will process lines 1..256, then 2561..2816, then 5121..5376, etc.
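The arithmetic can be sketched as below. This assumes 10 kids taking buffer loads in strict round-robin order, which is an idealisation, and notional_ranges is a hypothetical helper written for this illustration. Each range covers exactly 256 lines.

```perl
use strict;
use warnings;

# Return the notional [first, last] line ranges handled by one kid,
# assuming the kids refill their buffers in strict round-robin order.
# $kid counts from 0. This helper is hypothetical, for illustration only.
sub notional_ranges {
    my ($kid, $n_kids, $lines_per_buffer, $rounds) = @_;
    my @ranges;
    for my $round (0 .. $rounds - 1) {
        my $first = ($round * $n_kids + $kid) * $lines_per_buffer + 1;
        push @ranges, [ $first, $first + $lines_per_buffer - 1 ];
    }
    return @ranges;
}

my $line_bytes       = 16;
my $buffer_bytes     = 4096;
my $lines_per_buffer = $buffer_bytes / $line_bytes;    # 256

my @first_kid = notional_ranges(0, 10, $lines_per_buffer, 3);
print join(', ', map { "$_->[0]..$_->[1]" } @first_kid), "\n";
# prints: 1..256, 2561..2816, 5121..5376
```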

However, it would be dangerous to rely on this, as the order in which the kids run is non-deterministic. It only appears somewhat deterministic in your example because the sleep 1 tends to serialise them.

If you remove that sleep, you will see a much greater variability in the results. Each thread will still tend to process blocks of 256 lines at a time, but the first 2 or 3 kids will tend to process the bulk of the file and the others will process little or nothing.

This effect is not (as is often assumed) a bug in the scheduler. It is simply that without the sleep, the first few kids use their full timeslice and by the time the 3rd or 4th kid is started, the whole file has been processed.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

Re^2: Unexpected output from fork (Win32)
by maa (Pilgrim) on Aug 09, 2004 at 14:03 UTC
    As your lines are 16 bytes, and each kid is processing 256 lines per buffer load, that makes the buffer size 4096 bytes. Notionally, the first thread to run will process lines 1..256, then 2561..2816, then 5121..5376, etc.

    How blindingly obvious :-) Thanks.

    This was only a test program - I put the sleep statement in because the first thread processed all 10_000 without it and the operations that will eventually be in there will certainly take several seconds to complete.

    Once again thanks for a crystal clear explanation! It all makes sense now.

    - Mark
