in reply to Robustly piping several processes.

I hope I've understood what you said. I did this on Windows, as I don't have a Unix box around today (sweet home), but you can easily move it to Unix. I tested the following code with:
type a.pl|perl -w a.pl
and it demonstrates how you can make your scripts follow the behavior of a traditional pipe and take the output of another process as input.
a.pl:

@in = <STDIN>;
foreach (@in) {
    print "from perl script: ", $_;
}
Update 2: I don't know whether or not you have control over cmd1, cmd2, etc. If you do, I would strongly suggest using pipe() to connect those handles into pairs.
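For example, here is a rough sketch of what that could look like when the script itself sets up the plumbing, with cmd1 and cmd2 standing in for the real programs: the script creates the pipe, forks, and points each command at one end of it.

pipe(READER, WRITER) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # child: run cmd1 with its STDOUT going into the pipe
    close(READER);
    open(STDOUT, ">&WRITER") or die "dup failed: $!";
    exec('cmd1') or die "exec cmd1 failed: $!";
}

# parent: run cmd2 with its STDIN coming from the pipe
close(WRITER);
open(STDIN, "<&READER") or die "dup failed: $!";
exec('cmd2') or die "exec cmd2 failed: $!";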

Update: Hm... now I see what you mean: you want your script to act as the "middle man". Anyway, I would still like to leave my original reply in place, as it is still a good answer to "the question I thought it was".

Your approach ALONE sounds good to me, but that while loop can be simplified; you may borrow the concept from the following code:
open(DATA, "<", "ex902.pl"); @in = <DATA>;#read in as array close(DATA); open(DATA1, ">", "data.txt"); print DATA1 @in;#flush the whole array out close(DATA1);
However, I am wondering why you are trying to do this. It sounds to me like you are reinventing something the OS already does for you. For the project you are working on, is there an alternative solution that lets you utilize the OS as much as possible? Anyway, I'm sure you have a good reason for doing it this way.

Re: Re: Robustly piping several processes.
by BazB (Priest) on Dec 25, 2002 at 23:43 UTC

    I think you've misunderstood what I'm after.

    I want to execute multiple commands, piped together, with error checking, STDERR capture, and the like, from within my scripts (hence the comments about the IPC:: modules and piped open()s), not to make single scripts behave nicely as part of a pipeline.
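    For what it's worth, a minimal sketch of the IPC::Open3 route for a single command (cmd1 and its argument are just placeholders), capturing STDERR separately and checking the exit status:

    use strict;
    use IPC::Open3;
    use Symbol qw(gensym);

    my ($to_cmd, $from_cmd, $err) = (gensym, gensym, gensym);
    my $pid = open3($to_cmd, $from_cmd, $err, 'cmd1', '--some-arg');

    close($to_cmd);    # nothing to feed it in this example

    # Reading the handles one after the other can block if the command
    # writes a lot to both; IO::Select is the usual fix for that.
    while (my $line = <$from_cmd>) {
        print $line;                  # or hand the data to the next stage
    }
    while (my $line = <$err>) {
        warn "cmd1 stderr: $line";
    }

    waitpid($pid, 0);
    die "cmd1 exited with status ", $? >> 8, "\n" if $?;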

    Cheers.

    BazB.

    Update to pg's update: the suggestion to read the output from the first command into an array will not work.
    That data could be up to ~30 million lines or ~50GB.
    That's the whole point of pipes - you don't have to be able to store the whole dataset in memory, nor waste time writing intermediate stages to disk.

    The potential size of the input is also why I use while loops in my current code, although read() would probably be more efficient, since the data consists of fixed-length records.
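    As a sketch, that read()-based loop might look something like this (the record length here is just an assumed example value):

    my $record_length = 100;    # assumed; the real length depends on the data format
    my $record;
    while (read(STDIN, $record, $record_length)) {
        # handle exactly one fixed-length record per pass, then pass it on
        print $record;
    }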

    Doing this in the shell directly might be easier, but the benefits of using Perl for building the application are more of an issue.

    Update 2: Ah ha! Me thinks pg(++!) has cracked it.
    pipe() seems to be the way to go. I'm rather surprised that I'd not come across it before.

      BazB and I had some very interesting discussions, on and off, in the chat room and through MessageBox, and we have now come to an agreement that the pipe() function introduced in perlipc would be a good solution for connecting I/O handles between processes.

      He suggested that I post a reply to complete this thread, and I am doing so now. A piece of sample code is attached:
      use IO::Handle;
      use strict;

      $|++;

      pipe(PARENT_READER, CHILD_WRITER);
      pipe(CHILD_READER, PARENT_WRITER);
      PARENT_WRITER->autoflush(1);
      CHILD_WRITER->autoflush(1);

      my $pid;
      if ($pid = fork()) {
          # parent: read from PARENT_READER, reply via PARENT_WRITER
          close(CHILD_READER);
          close(CHILD_WRITER);
          my $buffer;
          print PARENT_WRITER 1;
          while (1) {
              sysread(PARENT_READER, $buffer, 100);
              print "parent revd: $buffer, and reply with ", $buffer + 1, "\n";
              sleep(1);
              print PARENT_WRITER $buffer + 1;
          }
      } else {
          # child: read from CHILD_READER, reply via CHILD_WRITER
          close(PARENT_READER);
          close(PARENT_WRITER);
          my $buffer;
          while (1) {
              sysread(CHILD_READER, $buffer, 1000);
              print "child revd: $buffer, and reply with ", $buffer + 1, "\n";
              sleep(1);
              print CHILD_WRITER $buffer + 1;
          }
      }
      But if you want your script to look through the output while it's between the two processes, to check for errors and such, you do need the loop with the read/write. If you point each process at one end of a pipe you created, it will run on without your involvement. Maybe you can use the efficient pipe for stdout/stdin but still monitor stderr within the script.
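      A rough sketch of that "middle man" shape, assuming piped open()s on both sides and with cmd1/cmd2 standing in for the real programs: the script reads chunks from the first command, inspects them, and forwards them to the second, while close() reports each command's exit status.

      open(my $in,  '-|', 'cmd1') or die "cannot start cmd1: $!";
      open(my $out, '|-', 'cmd2') or die "cannot start cmd2: $!";

      my $buffer;
      while (read($in, $buffer, 8192)) {
          # inspect/validate the chunk here before passing it along
          print $out $buffer or die "write to cmd2 failed: $!";
      }

      close($in)  or die "cmd1 failed (exit status ", $? >> 8, ")\n";
      close($out) or die "cmd2 failed (exit status ", $? >> 8, ")\n";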

      I wrote a filter to act as a sniffer for a specific network protocol. I wrote code in Win32 to use IO completion ports to efficiently move the buffer from one to the other without copying it, and also passed it to an embedded Perl interpreter as a "tee" in the middle. The Perl could do its fine job of parsing the stuff and presenting it to me, but the processes piped efficiently as long as I didn't select the option for modification of the data stream by the Perl code (that is, report only). I used the modification ability to introduce errors or otherwise test things that my well-behaved program never did.

      —John