fbicknel has asked for the wisdom of the Perl Monks concerning the following question:

Greetings. So if you do this:
my @foo = @{[`cat very_large_file`]}[0..3];
does Perl temporarily grab enough memory to store very_large_file, then slice the first four lines?

Or does it abandon the cat process once it has what it needs? And if it does this, does it kill it or just leave it running to completion, ignoring its output?

To test this, I devised an experiment replacing cat... with a command that just kept generating output indefinitely:
#!/bin/bash
while :; do
    rand -N 100
done
The result seems to be that Perl keeps gobbling memory as long as the process generates output, then when the process ends it returns the appropriate slice.

Re: Slicing the output of a command
by hippo (Archbishop) on Jul 09, 2015 at 16:14 UTC

    Your experiment is a good one and your interpretation of the results is sound. So, if you only want the first few lines either use head instead of cat, or better yet just open the file within perl instead of shelling out and read a few lines that way.

Re: Slicing the output of a command
by Laurent_R (Canon) on Jul 09, 2015 at 17:57 UTC
    Just read as many lines as you need within a
    while (<$fh>) {...
    loop. Only one line at a time will be stored in memory.
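    A minimal sketch of that loop, assuming you want the first four lines of a file. The helper name first_lines and the temporary stand-in file are hypothetical, added only so the example runs on its own:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the first $count lines of the file at $path.
sub first_lines {
    my ($path, $count) = @_;
    open my $fh, '<', $path or die "Cannot open '$path': $!";
    my @lines;
    while (my $line = <$fh>) {
        push @lines, $line;
        last if @lines >= $count;   # stop reading as soon as we have enough
    }
    close $fh;
    return @lines;
}

# Stand-in for very_large_file so the sketch is self-contained.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line $_\n" for 1 .. 1000;
close $tmp_fh or die "Cannot close '$tmp_path': $!";

my @foo = first_lines($tmp_path, 4);
print @foo;
```

    Only the lines pushed onto @lines are kept; the rest of the file is never read.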
Re: Slicing the output of a command
by marioroy (Prior) on Jul 09, 2015 at 18:48 UTC

    Reading files directly in Perl may be more efficient. Otherwise, here is a head demo in Perl that reads the output of an external command and keeps only the first $count lines. open can be used to start a command; note the | after $cmd. Closing the handle stops the command, most likely because it receives a PIPE signal the next time it writes to the closed pipe.

    This is written mainly to showcase a feature of open. Scripting is lots of fun and TIMTOWTDI. Have fun orchestrating all the tools available to you.

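    use strict;
    use warnings;

    # Run $cmd and return at most $count lines of its output (default 10).
    sub head {
        my ($cmd, $count) = (shift, shift || 10);
        my @output;

        # The trailing | tells open to run $cmd and read from its stdout.
        open my $fh, "$cmd |" or die "error: '$cmd' failed: $!";

        while (<$fh>) {
            push @output, $_;
            last if $. == $count;   # $. is the current input line number
        }

        close $fh;   # closing the read end stops the command
        return @output;
    }

    my @foo = head('cat very_large_file', 4);
    print @foo;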
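    A variant of the same idea, as a sketch only: the list form of a piped open ('-|' followed by the command and its arguments) runs the command without going through a shell. The helper name head_list is hypothetical, and $^X (the running perl) stands in for a command that produces a lot of output:

```perl
use strict;
use warnings;

# Hypothetical helper: like head() above, but the command is given as a list,
# so no shell is involved and no shell quoting is needed.
sub head_list {
    my ($count, @cmd) = @_;
    my @output;
    open my $fh, '-|', @cmd or die "error: cannot run '@cmd': $!";
    while (my $line = <$fh>) {
        push @output, $line;
        last if @output >= $count;
    }
    # close may return false if the child was killed by SIGPIPE; ignored here.
    close $fh;
    return @output;
}

# $^X is the path of the running perl, used as a stand-in for a chatty command.
my @foo = head_list(4, $^X, '-e', 'print "line $_\n" for 1 .. 100_000');
print @foo;
```

    The child is started once, read line by line, and abandoned after the fourth line, so memory use stays bounded no matter how much the command would have printed.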
Re: Slicing the output of a command
by KurtSchwind (Chaplain) on Jul 09, 2015 at 19:43 UTC

    As you have already determined, the cat runs to completion. The reason is sort of obvious. Perl doesn't know, in advance, that you only want the first few lines of output. It's going to execute everything between the back-ticks first.

Re: Slicing the output of a command
by TomDLux (Vicar) on Jul 10, 2015 at 02:42 UTC

    Every once in a while, there is some activity which is relatively simple to perform using a shell command, but would be somewhat inconvenient using Perl commands. The downside is that shelling out is generally sloppy.

    If you're going to get the contents of a file, and you're dealing with a very short throw-away script, it might make sense to `cat` it.

    But you're launching a shell (many milliseconds), opening the file (some milliseconds), loading all its contents into memory, and passing the contents from the sub-process to the Perl script. If you open the file in Perl, it takes the same amount of time as it does in a sub-shell: you still have to open() the file and <read> the contents, but there is no sub-shell to launch.

    Just about anything you want to do is available internally in Perl, either built-in, provided by a core module, or available as an add-on module. Go to Cpan.org or metacpan.org or search for what you need. For example, gzip-ing or gunzip-ing a file is as simple as use-ing the module, and specifying something in the open() command.
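    As a hedged illustration of the gzip point: the sketch below uses the core modules IO::Compress::Gzip and IO::Uncompress::Gunzip rather than an open() layer, which is a substitution on my part and not necessarily the module the post has in mind. The temporary file and its contents are made up for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Create a small gzipped stand-in file (hypothetical data).
my ($tmp_fh, $gz_path) = tempfile(SUFFIX => '.gz', UNLINK => 1);
close $tmp_fh;
my $text = join '', map { "line $_\n" } 1 .. 1000;
gzip \$text => $gz_path or die "gzip failed: $GzipError";

# Read back only the first four lines, decompressing as we go.
my $z = IO::Uncompress::Gunzip->new($gz_path)
    or die "gunzip failed: $GunzipError";
my @first;
while (defined(my $line = $z->getline)) {
    push @first, $line;
    last if @first >= 4;
}
$z->close;

print @first;
```

    No external gzip or zcat process is started, and the loop stops after four lines instead of decompressing the whole file.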

    As Occam said: Entia non sunt multiplicanda praeter necessitatem.

      That is beautiful TomDLux. Thus, a demonstration for the OP.

      my @foo = `cat very_large_file | head -4`;

        Why cat the whole file just to discard all but the first four lines?

        my @foo = `head -4 very_large_file`;

Re: Slicing the output of a command
by wee (Scribe) on Jul 09, 2015 at 22:21 UTC

    "The result seems to be that Perl keeps gobbling memory as long as the process generates output, then when the process ends it returns the appropriate slice. "

    That it does. Perl doesn't know what the command might do, so it can't take a slice until the command returns. And cat reads everything (this is why it's also a bad idea to use something like 'cat file.txt | more' instead of 'less file.txt').

    Also, it's generally not such a good idea to shell out to an external command when you have that same functionality built in.