in reply to Re^4: Stuttering Children
in thread Stuttering Children

Well, DATA is just a buffered filehandle that has read the source file up to the __END__. Some of the file content beyond that point has already been read into a buffer; the rest has yet to be read from the OS. When you fork, each process gets its own copy of the buffer, but shares the filehandle for any remaining data. So you're bound to get odd effects.

Dave.

Re^6: Stuttering Children (shared seek)
by tye (Sage) on Sep 23, 2014 at 16:03 UTC

    It is more than that. The fork()ed processes also share their file offsets, and not just initially. For example, add a few sleeps to get:

    seek( DATA, 0, 1 ) if @ARGV;

    foreach my $i (1..2) {
        _spawn($i);
        sleep 1;
    }
    sleep 4;

    sub _spawn {
        my $id = shift || die "Missing id\n";
        my $pid = fork();
        defined $pid or die "bad open (pipe/fork): $!\n";

        # Have parent/child run their respective code...
        if ( $pid ) {
            # PARENT CODE...
            return;
        } else {
            # CHILD CODE...
            print "CHILD$id: My pid = $$\n" ;
            while(<DATA>) { print "CHILD$id: $_"; sleep 3 };
            exit
        }
    }
    __DATA__
    1: THIS IS YOUR PARENT SPEAKING...
    2: THIS IS YOUR PARENT SPEAKING...

    And I get the following output:

    CHILD1: My pid = 31325
    CHILD1: 1: THIS IS YOUR PARENT SPEAKING...
    CHILD2: My pid = 31326
    CHILD1: 2: THIS IS YOUR PARENT SPEAKING...

    (Note how each child outputs just one of the two lines.)

    That seek line I added in case one needs to reset the buffering that you talked about. But it isn't needed on my system (which makes me suspect that Perl does the equivalent for the DATA handle, either when it first sets it up or when about to fork; I haven't searched for other evidence of either behavior, though).

    I doubt Solaris is different on this point (I don't recall Advanced Programming in the UNIX Environment calling this out as only applying to System V descendants, for example). So my guess would be that both children producing output can be explained by the input lines being buffered in the DATA handle and not being flushed before the children are forked.

    My even wilder guess is that Perl is trying to do that seek line when fork is called, but, due to some quirk of whichever I/O layer Perl gets built with there, the effect is instead like the equivalent sysseek call, resetting the file position but not flushing the input buffer of the DATA file handle.

    So each child reads the two buffered lines from the DATA file handle and then one of them manages to (re)read the two lines from the underlying file descriptor (moving its position so the other child just finds EOF). But I can't explain how that could lead to the repeated output appearing last as in the root node's "TRY2".
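
    One way to separate the two effects discussed here (the per-process buffer versus the shared offset) is to take buffering out of the picture entirely. The sketch below is only an illustration, not code from this thread: it uses a temporary file instead of DATA, sticks to sysread/sysseek, and the helper name offset_after_child_read is made up. Having the child leave via POSIX::_exit, so that no Perl-level cleanup touches the handle, is also an assumption made to keep the demo clean.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: returns the file offset the parent sees after a
# forked child has consumed $nbytes from the same handle with sysread.
sub offset_after_child_read {
    my ($nbytes) = @_;

    # Stand-in for the __DATA__ section: a small temporary file.
    my ($out, $path) = tempfile(UNLINK => 1);
    print {$out} "1: line one\n2: line two\n";
    close $out or die "close: $!";

    open my $fh, '<', $path or die "open: $!";

    my $pid = fork();
    defined $pid or die "fork: $!";
    if ($pid == 0) {
        # CHILD: unbuffered read, then exit without running Perl's cleanup.
        sysread($fh, my $buf, $nbytes);
        POSIX::_exit(0);
    }
    waitpid($pid, 0);

    # PARENT: has read nothing itself; ask the OS where the offset is now.
    my $pos = sysseek($fh, 0, SEEK_CUR);
    defined $pos or die "sysseek: $!";
    return $pos + 0;    # sysseek returns "0 but true" for offset zero
}

print "parent offset after child read: ", offset_after_child_read(12), "\n";
```

    If the offset were private to each process, the parent would still be at 0 here; a non-zero result means the child's read moved the position for both of them.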

    But perhaps some of that might help or just prompt somebody else to figure out what is actually going on.

    - tye        

      ++tye thanks for this post. These are the kinds of responses that keep me coming back here.

      I've redesigned my code to avoid using __DATA__ in forked children, but I still want to know why this is happening and what's going on.

      It's like an itch I can't scratch...

      Your post gave me new things to think about. Maybe others can help out as well, or a flash of insight will hit me and all will become clear. If that happens (fingers-crossed) I'll be sure to post.

      Thanks

      -Craig

        Then you should re-run this type of experiment in your environment while using strace (or truss, or whatever it is called on your version of Solaris) to see what Perl is actually asking of the OS. Also, put more data at the end than Perl will read into buffers (how much that needs to be should become clear after looking at the first strace output).
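
        As a rough complement to strace (not a replacement for it), the amount Perl reads ahead can be estimated from inside Perl by comparing tell, which reports the logical position of the buffered handle, with sysseek($fh, 0, SEEK_CUR), which reports where the underlying descriptor actually is. The sketch below assumes a temporary file standing in for the DATA section; the helper name read_ahead_probe and the line count are made up for illustration, and the read-ahead size it reports will vary by Perl build.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

# Hypothetical helper: read one line through the buffered handle, then
# return (logical position, descriptor position, file size).
sub read_ahead_probe {
    my ($total_lines) = @_;

    # Write far more data than any plausible I/O buffer would hold.
    my ($out, $path) = tempfile(UNLINK => 1);
    print {$out} "$_: THIS IS YOUR PARENT SPEAKING...\n" for 1 .. $total_lines;
    close $out or die "close: $!";

    open my $fh, '<', $path or die "open: $!";
    my $first = <$fh>;                           # buffered read of one line

    my $logical  = tell($fh);                    # where Perl says we are
    my $physical = sysseek($fh, 0, SEEK_CUR);    # where the descriptor is
    defined $physical or die "sysseek: $!";

    return ($logical, $physical + 0, -s $path);
}

my ($logical, $physical, $size) = read_ahead_probe(50_000);
printf "logical=%d physical=%d size=%d read-ahead=%d\n",
    $logical, $physical, $size, $physical - $logical;
```

        The difference between the two positions is the data sitting in the buffer, so the test data needs to be comfortably larger than that for the shared-offset effects to show up.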

        - tye