in reply to Re: Unexpected output from fork (Win32)
in thread Unexpected output from fork (Win32)

Thanks for your informative reply, Jenda.

Yes, after reading perlvar again I see why my $|++ shouldn't work (although adding it did get rid of the duplicate entries). I couldn't, however, find any mention of the size of the input buffer used by <> or readline(); both simply promise to read up to the next $/ (or EOF) when evaluated in scalar context. Can you point me to the appropriate document, please? I (wrongly) assumed that, as the seek pointer is shared, I'd get whole records, but you've disproved that :-)

I have already tried using an array containing all the input, but that has its own problems when you use fork(). Perhaps it's time I tried to use threads; :-) Then I can share the array.

- Mark

Re^3: Unexpected output from fork (Win32)
by BrowserUk (Patriarch) on Aug 09, 2004 at 17:51 UTC

    A couple of things.

    First, as Win32 pseudo-forks are threads, you can (apparently) use threads::shared to share an array (or other data) between them:

    From threads::shared POD:


    By default, variables are private to each thread, and each newly created thread gets a private copy of each existing variable. This module allows you to share variables across different threads (and pseudoforks on Win32). It is used together with the threads module.

    Though I admit I've never actually tried this.

    Second, doing the equivalent of your OP code using threads is much simpler.
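    For illustration only, here is a minimal sketch of threads working through one shared array. It assumes a perl built with ithreads; the host names, the worker count of 4, and the worker() sub are made up for the example and are not taken from the original code.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Shared work list and result list (host names are invented for the sketch).
my @todo :shared;
my @done :shared;
@todo = map { "host$_" } 1 .. 20;

sub worker {
    while (1) {
        my $host;
        {
            lock(@todo);            # only one thread at a time may take a name
            $host = shift @todo;
        }
        last unless defined $host;  # list exhausted, this thread is finished

        # The real per-host work would go here.

        lock(@done);                # held until the end of this iteration
        push @done, $host;
    }
    return;
}

# Start four workers and wait for all of them.
my @workers = map { threads->create(\&worker) } 1 .. 4;
$_->join for @workers;

printf "%d hosts processed, %d left\n", scalar @done, scalar @todo;
```

    Every thread sees the same @todo, so no name is handed out twice and nothing has to be copied per worker.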

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re^3: Unexpected output from fork (Win32)
by Jenda (Abbot) on Aug 09, 2004 at 16:27 UTC

    It seems the seek pointer is shared, but the caches are not, which IMHO doesn't make sense. Either the handles should be completely separate or they should share the cache.

    I did not mean to share the array. The main thread would read the first tenth of the computer names into an array and fork() off a child, the child would have a copy of the array and would start processing those servers. In the meantime the main thread would empty its copy of the array, read the next tenth and spawn another child. And so forth.

    Of course this means that you will have the complete list of computer names in memory, which may or may not be the best thing to do.
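    A rough sketch of that read-a-batch-then-fork loop, under a few assumptions: check_host(), next_batch() and run_in_batches() are hypothetical names, the batch size of 10 is arbitrary, and the names come from an in-memory handle only so the example is self-contained. Each child reports how many names it handled through its exit status, which only works for batches of at most 255.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real per-machine work.
sub check_host { my ($host) = @_; return length $host; }

# Read up to $size names from $fh into a fresh list.
sub next_batch {
    my ($fh, $size) = @_;
    my @batch;
    while (@batch < $size) {
        my $line = <$fh>;
        last unless defined $line;
        chomp $line;
        push @batch, $line;
    }
    return @batch;
}

# Only the parent ever reads from $fh; each child works on its own copy
# of the batch that was current when it was forked.
sub run_in_batches {
    my ($fh, $size) = @_;
    my @pids;
    while (my @batch = next_batch($fh, $size)) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            check_host($_) for @batch;
            exit scalar @batch;       # report the batch size to the parent
        }
        push @pids, $pid;
    }
    my $handled = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $handled += $? >> 8;          # exit status of that child
    }
    return $handled;
}

my $names = join '', map { "host$_\n" } 1 .. 25;
open my $fh, '<', \$names or die "open: $!";
my $count = run_in_batches($fh, 10);
print "children handled $count names\n";
```

    Because the children never touch the file handle, the question of who owns which part of the input buffer does not arise.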

    Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
       -- Rick Osborne

      There is a good explanation for why the file handles are shared but the buffers are not. The buffering is done by Perl, in the IO layer, so each process gets its own memory and therefore its own buffers. The low-level read is done by the OS, which shares the file handle. There is also a cache in the OS, but it has no effect on the behavior of the processes, other than making the reads faster.
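      The gap between the two layers can be observed from a single process. The sketch below assumes a small throwaway file created with File::Temp (the contents are made up); it reads one record through Perl's buffered layer and then compares tell(), which reports Perl's logical position, with sysseek($in, 0, SEEK_CUR), which reports the OS-level file pointer without moving it. The exact buffer size depends on the perl build, so no particular number is assumed here.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

# Build a small throwaway file of made-up host names.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "host$_\n" for 1 .. 100;
close $out or die "close: $!";

open my $in, '<', $path or die "open: $!";
my $first = <$in>;                         # buffered read of one record

my $perl_pos = tell($in);                  # position as Perl's IO layer sees it
my $os_pos   = sysseek($in, 0, SEEK_CUR);  # position of the OS file pointer

printf "record length %d, tell %d, OS offset %d, file size %d\n",
    length($first), $perl_pos, $os_pos, -s $path;
```

      Perl hands back one record but has already pulled a larger block from the OS, so the shared file pointer sits well past the end of that record. Another process that shares the pointer and fills its own buffer starts from there, not from the next record boundary.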

      Hi, I've already tried a very similar approach, but found that some of the operations would inexplicably fail after about 200 fork()s. I was doing 10 fork()s at a time and then waitpid()-ing for them. However, the Windows NT task manager seemed to think that the perl process was 'leaking' handles, though I couldn't track this leak down.

      The 'threads' count in task manager went up and down exactly as expected, as did 'handles' for a short while. Then the handle count started to drop by less than it had risen after each batch, so the number of open handles kept growing.

      Realistically, I'm never going to need to check logs on more than 5000 workstations at once, so I could try dividing the total number by 10 and slicing the array for each fork, since I think forking too much might be one of the problems. My alternative is to try ithreads.
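      The divide-by-10 slicing could look roughly like this. slices() is a hypothetical helper and the workstation names are invented; it returns at most the requested number of array references, each of which could then be handed to a single fork() or thread.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Split @$items into at most $parts consecutive slices of near-equal size.
sub slices {
    my ($items, $parts) = @_;
    my $per = ceil(@$items / $parts);      # names per slice, rounded up
    my @slices;
    for (my $i = 0; $i < @$items; $i += $per) {
        my $end = $i + $per - 1;
        $end = $#$items if $end > $#$items;   # last slice may be shorter
        push @slices, [ @{$items}[ $i .. $end ] ];
    }
    return @slices;
}

my @hosts = map { "ws$_" } 1 .. 5000;
my @parts = slices(\@hosts, 10);
printf "%d slices, first has %d names\n", scalar @parts, scalar @{ $parts[0] };
```

      With 5000 names this means 10 children in total instead of hundreds of short-lived ones.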