abhijithtk has asked for the wisdom of the Perl Monks concerning the following question:

Hello All..

I just have a couple of questions and will be glad to read your responses. Detailed explanations of what exactly is happening/going on will be appreciated.

I'll post a simple piece of code and then the question.
#!/usr/bin/perl
use strict;
use warnings;

my $x = 'ls -l';
open(OP, "$x |");
while (<OP>) {
    print;
}
close OP;

Another way to execute a command is using backticks.

$x = `ls -l`;

When we use the open method to execute a command (assuming the command produces a lot of data), less memory is apparently used because the output is read line by line, as opposed to reading the entire output into a scalar with backticks.

My question is: how are these two methods different?

Even when we use the open method, isn't the entire output being stored somewhere and then fetched to be printed out line by line? If that's the case, don't they use up the same amount of memory?

Can anyone explain?

Re: Executing Commands with "open"
by ikegami (Patriarch) on Jun 14, 2010 at 15:56 UTC

    Even when we use the open method, isn't the entire output being stored somewhere and then fetched to be printed out line by line? If that's the case, don't they use up the same amount of memory?

    No, the data isn't stored anywhere.

    In both cases, the data is passed to Perl via a pipe (64k buffer?). Backticks empty the pipe into a scalar (or scalars in list context), whereas the open method gives you a handle to the pipe so you can empty it yourself. The child will block if it tries to write to a full pipe.

    Note that this form of open returns a pid, and you should call waitpid($pid, 0); to release the resources used by the child.
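
A minimal sketch of the streaming approach described above, assuming a POSIX system with sh available. The command is a hypothetical stand-in chosen so the output is predictable. The sketch relies on close() rather than waitpid(): perldoc documents that closing a piped handle waits for the child and sets $?, so this is one reasonable way to reap the child, not the only one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical command for illustration: prints three short lines.
my @cmd = ('sh', '-c', 'printf "one\ntwo\nthree\n"');

# A piped open returns the child's process ID (undef on failure).
# The list form runs the command without an intermediate shell.
my $pid = open(my $fh, '-|', @cmd)
    or die "Cannot start '@cmd': $!";

my $lines = 0;
while (my $line = <$fh>) {
    # Perl holds only the current line (plus its own small read buffer),
    # not the whole output of the command.
    $lines++;
    print $line;
}

# close() on a piped handle waits for the child and sets $?.
close($fh) or warn "Child exited with status ", $? >> 8, "\n";
my $exit_status = $? >> 8;
```

With backticks the same output would arrive as one scalar, and only after the command had finished.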

      Where is this buffer? Doesn't it count as memory? Or is it a physical memory location?

      So according to what you said, the output is stored in this buffer. Backticks empty it into a scalar all at once.

      In the case of open, on calling the open statement, the command gets executed and the output is stored in the buffer, and the filehandle is for the buffer?? :)

      Have I understood it correctly?
        Where is this buffer?
        It's handled by the kernel.
        Doesn't it count as memory?
        It does. Just not Perl memory (which means you can store a lot more data for the same amount of memory ;-))
        So according to what you said, the output is stored in this buffer. Backticks empty it into a scalar all at once.
        Yes, which gives another difference (and, IMO, the most important difference) between a pipe read and backticks: with backticks, the entire output is collected first - that is, the called program has to finish before the Perl program can continue; with a pipe, the Perl program can do something as soon as a line of data becomes available.
        In the case of open, on calling the open statement, the command gets executed and the output is stored in the buffer, and the filehandle is for the buffer?? :)
        Sort of. Both the command and the Perl program can run in parallel. And the handle you get isn't quite the same as the handle you get when opening a file. You cannot seek() on a pipe, for instance.
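
The point about seek() can be checked with a short sketch. It assumes a POSIX system where an echo command is on the PATH; the variable names are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Open a pipe from a trivial command (echo is assumed to be available).
my $pid = open(my $pipe, '-|', 'echo', 'hello')
    or die "Cannot start echo: $!";

# seek() returns false on a pipe; on POSIX systems errno is ESPIPE.
# Using %! loads the Errno module automatically.
my $seek_ok        = seek($pipe, 0, 0);
my $seek_is_espipe = (!$seek_ok && $!{ESPIPE}) ? 1 : 0;
print "seek failed: $!\n" unless $seek_ok;

# The handle is still usable for sequential reads afterwards.
my $line = <$pipe>;
print "read: $line" if defined $line;

close($pipe);
```

A regular file handle would accept the same seek() call, which is the practical difference the reply refers to.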

        Where is this buffer??

        It's part of the system file handle.

        Doesn't it count as memory?

        It holds data, so it's some form of memory by definition.

        But it's a rather small (64k?), fixed-size buffer.

        So according to what you said, the output is stored in this buffer. Backticks empty it into a scalar all at once.

        No. That would require the output of the child to fit in the buffer, but that's impossible since it's a fixed-size buffer. Backticks repeatedly and continually empty the pipe into a scalar. Backticks are more or less equivalent to:

        my $pid = open(my $fh, '-|', $cmd);
        my $scalar = '';
        # BLK_SIZE stands for whatever read size the implementation uses.
        1 while sysread($fh, $scalar, BLK_SIZE, length($scalar));
        waitpid($pid, 0);
        return $scalar;

        The scalar keeps growing and growing.

        Update: Added the first two answers.

Re: Executing Commands with "open"
by cdarke (Prior) on Jun 14, 2010 at 17:37 UTC
    The size of the buffer, and the way that anonymous pipes are handled in detail, is dependent on the operating system in use, even assuming UNIX (since your example is ls -l). For example, Linux anonymous pipes have an (invisible) inode generated and are handled by the file system, but not all UNIXes behave in that way.

    The size of guaranteed atomic writes is represented by the POSIX constant PIPE_BUF, and it can be retrieved for a given system using the pathconf() or fpathconf() functions in C. It is usual for the value to vary between 1024 and 5120 bytes. In any case it must be at least 512 bytes. However, many applications using, for example, stdio (which perl uses by default) will buffer by text lines, the newline character representing the end-of-record.
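
The same limit can be queried from Perl. The sketch below assumes the core POSIX module, which provides fpathconf() and the _PC_PIPE_BUF constant, and a platform that supports pipe(). Note that this value is the atomic-write limit, which is not necessarily the total capacity of the pipe buffer.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(fpathconf _PC_PIPE_BUF);

# Create an anonymous pipe just to have a descriptor to query.
pipe(my $reader, my $writer) or die "pipe failed: $!";

# fpathconf() takes a file descriptor number, not a Perl filehandle.
my $pipe_buf = fpathconf(fileno($writer), _PC_PIPE_BUF);
die "fpathconf failed: $!" unless defined $pipe_buf;

print "Atomic write limit (PIPE_BUF) for this pipe: $pipe_buf bytes\n";

close($writer);
close($reader);
```

The printed value depends on the operating system, so no particular number should be expected.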

    Think of using an anonymous pipe as a form of file IO. I'm sure you realise that writing to a file is normally done in units of one buffer (and that is a gross simplification); pipes are handled in a similar way.