in reply to Re: parallel process on remote machines,read results and hanle timeout of those process


Thanks for your explanation. It helps me understand better.
That is why the script reads 1024 bytes at a time in a loop for each filehandle. A non-blocking filehandle has to be read in chunks like this.
Using "while (<FH>)" is more for a blocking filehandle: I mean, if "open" did not fail, you can be sure to read all lines from the filehandle.

My third question is about "EAGAIN() and retry". I didn't understand this part of the code:

    $hl->{$_}->{retry} = 0;
    $hl->{$_}->{retries} = 0;

    my $start = time;
    my $blocksize = 1024;
    while (scalar keys %hltodo) {
        machine:
        for (keys %hltodo) {
            my $out = $hl->{$_}->{chld_out};
            # begin to read
            my $bytes_read = -1;
            while ($bytes_read) {
                my $buf;
                my $bytes_read = sysread($out, $buf, $blocksize);
                if (defined($bytes_read)) {
                    if ($bytes_read == 0) {
                        # eof
                        close($out);
                        last;
                    } else {
                        $hl->{$_}->{data}.= $buf;
                    }
                } else {
                    if ($! == EAGAIN()) {
                        # retry
                        $hl->{$_}->{retry}++;
                        $hl->{$_}->{retries}++;
                        if ($hl->{$_}->{retry}) {
                            $hl->{$_}->{retry} = 0;
                            next machine;
                        }
                        usleep 10;
                    } else {
                        last;
                    }
                }
            }
            delete $hl->{$_}->{"chld_out"};
            delete $hltodo{$_};
        }
        # kill remaining pids if timeout reached
        if ($opt{timeout} && time > $start + $opt{timeout}) {
            print STDERR "Timeout for: ", join (" ", keys %hltodo), " killing ", join (" ", values %hltodo) ;
            kill 1, values %hltodo;
            %hltodo = ();
        }
    }

When a read on the non-blocking filehandle would block, for whatever reason, $! is set to EAGAIN. Then "$hl->{$_}->{retry}++" makes retry 1, so the code goes to "$hl->{$_}->{retry} = 0" and "next machine". So it will never reach the usleep of 10 microseconds? I must be missing something in this part.

Re^3: parallel process on remote machines,read results and hanle timeout of those process
by BrowserUk (Patriarch) on Oct 31, 2014 at 12:55 UTC

    EAGAIN means that although there is something available on the socket (which is why select handed it to you), at the exact moment you tried to read it something in the system or TCP stack was busy. Rather than block, the read returns EAGAIN and lets you do something else in the meantime before trying again.
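
A quick way to see EAGAIN for yourself on Linux is to read from a non-blocking pipe that has nothing in it yet. This is only a minimal sketch: the pipe stands in for the child's output handle, and the variable names are made up for the illustration.

```perl
use strict;
use warnings;
use Errno qw(EAGAIN);
use IO::Handle;

# A local pipe stands in for the child's output handle (assumption).
pipe(my $r, my $w) or die "pipe: $!";
$r->blocking(0);                      # make the read end non-blocking

# Nothing has been written yet: sysread returns undef and sets $!.
my $n = sysread($r, my $buf, 1024);
if (!defined $n && $! == EAGAIN()) {
    print "no data yet (EAGAIN)\n";
}

# Once data is there, the same call succeeds.
syswrite($w, "ready\n");
$n = sysread($r, $buf, 1024);
print "read $n bytes: $buf";
```

Note that undef from sysread is not the same as 0: undef means "error, look at $!", while 0 means end of file.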

    I agree with you that the retry logic in your code snippet is borked. It will only attempt one retry and will never do the usleep. What you choose to do about that is up to you. Personally, I think I'd probably omit the retry logic completely and just do the microsleep and loop back to the select; but you should probably consult someone with more *nix experience than me if that is your platform.
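
One way to read "omit the retry logic and just do the microsleep" is sketched below. It is not the original script: slurp_nonblocking is a hypothetical helper, it loops on a single handle instead of going back to a select over many, and the 10 ms sleep and the deadline handling are assumptions chosen for the example.

```perl
use strict;
use warnings;
use Errno qw(EAGAIN);
use IO::Handle;
use Time::HiRes qw(usleep time);

# Hypothetical helper: read a non-blocking handle until EOF, an error,
# or until $timeout seconds have passed.
# Returns (data, status), where status is 'eof', 'timeout' or 'error'.
sub slurp_nonblocking {
    my ($fh, $timeout, $blocksize) = @_;
    $blocksize ||= 1024;
    my $deadline = time() + $timeout;
    my $data = '';
    while (1) {
        my $n = sysread($fh, my $buf, $blocksize);
        if (defined $n) {
            return ($data, 'eof') if $n == 0;   # writer closed its end
            $data .= $buf;
        }
        elsif ($! == EAGAIN()) {
            # Nothing to read right now: give up if too late, else nap.
            return ($data, 'timeout') if time() > $deadline;
            usleep 10_000;                      # 10 ms, not 10 microseconds
        }
        else {
            return ($data, 'error');            # a real read error
        }
    }
}

# Usage: a local pipe stands in for the child's output handle.
pipe(my $r, my $w) or die "pipe: $!";
$r->blocking(0);
syswrite($w, "hello\n");
close $w;

my ($data, $status) = slurp_nonblocking($r, 2);
print "$status: $data";
```

With several machines you would keep the outer loop over the handles and move on to the next one on EAGAIN instead of sleeping inside the helper; the point here is only that the sleep is actually reached and the deadline is always checked.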


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Thanks for your patience. It is really nice of the Perl experts here to answer questions.

      Yes, my platform is Linux. I have some questions about nfreeze / Storable.

      Here is the Perl code that brings the results back to the main program:
      my $results_serialized = nfreeze \%testresults;
      print $results_serialized;

      1. What is the advantage of a persistent data structure? Is it that all the data sits in one block of memory, so it is fast? What kinds of needs is it suited to?
      2. If I don't use nfreeze, I mean, if I just use

      return \%testresults
      does it also work?

        1. What is the advantage of a persistent data structure? Is it that all the data sits in one block of memory, so it is fast? What kinds of needs is it suited to?

        2. If I don't use nfreeze, I mean, if I just use "return \%testresults", does it also work?

        It is quite hard to answer those questions without seeing their actual use in context.

        On the face of it, it doesn't make a lot of sense to freeze a hash in order to return it from a subroutine.
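
If, as the print in the snippet hints, the frozen hash is written to a filehandle that another process reads, then freezing does make sense: a reference such as \%testresults only means something inside the process that created it. The sketch below assumes that setup. The fork and pipe and the contents of %testresults are invented for the illustration and are not taken from the original script.

```perl
use strict;
use warnings;
use Storable qw(nfreeze thaw);

# Hypothetical stand-in for the poster's %testresults.
my %testresults = (
    host1 => { status => 'ok',     rc => 0 },
    host2 => { status => 'failed', rc => 2 },
);

pipe(my $r, my $w) or die "pipe: $!";
my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: serialize the hash and send the bytes to the parent.
    close $r;
    binmode $w;
    print {$w} nfreeze(\%testresults);
    close $w;
    exit 0;
}

# Parent: read all the bytes, then rebuild the data structure.
close $w;
binmode $r;
my $frozen = do { local $/; <$r> };
close $r;
waitpid($pid, 0);

my $copy = thaw($frozen);
print "$_: $copy->{$_}{status}\n" for sort keys %$copy;
```

Within a single process, "return \%testresults" is all you need; the freeze and thaw pair only earns its keep when the data has to cross a process or machine boundary, or be stored in a file.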

        The only possible (tentative) clue I can glean from the snippet you've posted comes from the name %testresults. It is possible that the code goes on to compare those results with a pre-frozen, known good results hash; in which case the author might be relying upon doing a binary compare of the frozen hashes rather than having to do a looping, possibly recursive traversal to compare them. If so, it might be cleverly efficient; or just obscurely dangerous.
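
If that guess is right, the comparison might look roughly like the sketch below. It is an assumption about how such a check could be written, not the code in question. $Storable::canonical is a real Storable setting that makes hash keys get written in sorted order; the hashes are invented for the example.

```perl
use strict;
use warnings;
use Storable qw(nfreeze);

# Without this, two equal hashes may freeze to different byte strings,
# because hash key order is not fixed.
$Storable::canonical = 1;

# Hypothetical "known good" and "actual" results with the same content.
my %expected = (a => 1, b => [2, 3], c => { d => 4 });
my %actual   = (c => { d => 4 }, b => [2, 3], a => 1);

my $same = nfreeze(\%expected) eq nfreeze(\%actual);
print $same ? "same\n" : "different\n";

# The dangerous part: values that are == or eq in Perl are not
# necessarily stored the same way, for example the number 1 versus
# the string "1".
$actual{a} = "1";
my $still_same = nfreeze(\%expected) eq nfreeze(\%actual);
print $still_same ? "same\n" : "different\n";
```

The first comparison reports the two hashes as the same. The second one is where the "obscurely dangerous" part comes in: the frozen images typically differ even though Perl would treat the two values as equal.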

