nefigah has asked for the wisdom of the Perl Monks concerning the following question:

Hello! I've started learning Perl this weekend (read Learning Perl and am making my way through Intermediate Perl now), and was quite drawn in by what seems to be a very strong community. The concept of this site makes me warm and fuzzy :)

I'd like to learn Perl in "the Perl way," as I know such styles exist for a reason and help to improve readability among other programmers of the language. So comments about style in addition to actual functionality would be appreciated.

I have a general question, involving the optional nature of parentheses, and an example of said question has arisen in this little exercise from the book that I'm doing.
"Write a program that takes a list of filenames on the command line and uses grep to select the ones whose size in bytes is less than 1000. Use map to transform the strings in this list, putting four space characters in front of each and a newline character after. Print the resulting list." (I increased the number to 10000 to more easily accommodate files I had sitting on my desktop for testing purposes.)
Here's what I have:
    use warnings;
    use strict;

    my @small_files = grep( (-s) < 10000, @ARGV);
    my @output = map s/(.*)/    $1\n/, @small_files;
    print @output;

First question is, of course: why isn't it working like I want? The grep seems to be fine, but the @output seems to contain 4 1's instead of 4 modified strings of filenames.
Second question: when I first wrote it, my grep statement looked like: my @small_files = grep -s < 10000, @ARGV;
But that had an error, so I had to put the parentheses in, which made me kinda sad. Why was that the case?
Thanks!
~J. (Just Another Perl Newbie?)

Replies are listed 'Best First'.
Re: Newbie: parentheses, map, etc.
by kyle (Abbot) on Mar 04, 2008 at 03:40 UTC

    It's not working because s/// returns the number of replacements that it did, not the string after replacement. If you want that, you can say map { s/.../.../; $_ } @stuff. Note, however, that this modifies the original list. Since you're modifying the values in place anyway, you might as well use a for loop instead.

    s/(.*)/    $1\n/ for @small_files;
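A minimal sketch of the point above, using made-up file names rather than real files. It shows the counts that map collects, the in-place change to the source list, and, as an assumption beyond this thread, the /r flag that Perl 5.14 and later provide for getting a changed copy back instead.

```perl
use v5.14;      # needed only for the /r flag in the last example
use strict;
use warnings;

my @names = ('a.txt', 'b.txt');     # hypothetical file names

# s/// evaluates to the number of substitutions, so map collects 1s,
# and the elements of @names are changed through the $_ alias.
my @counts = map { s/(.*)/    $1\n/ } @names;

# Returning $_ after the substitution yields the strings,
# but the source list is still modified in place.
my @fresh  = ('a.txt', 'b.txt');
my @padded = map { s/(.*)/    $1\n/; $_ } @fresh;

# With /r the substitution returns the changed copy
# and leaves the original element untouched.
my @intact = ('a.txt', 'b.txt');
my @copies = map { s/(.*)/    $1\n/r } @intact;
```

Of the three, only the /r form leaves its input list as it was.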

    The error you get without the parentheses on grep is (at least when I tried it):

    Warning: Use of "-s" without parentheses is ambiguous
    Unterminated <> operator

    You can disambiguate -s by putting $_ in the expression explicitly.

    my @small_files = grep -s $_ < 10_000, @ARGV;

    Otherwise, it thinks that you were going to say something like -s <STDIN> but forgot the closing >.
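For comparison, here is a sketch of three spellings that keep the parser from reading "<" as the start of a <...> operator. The scratch directory and the two file names are invented for the example; only -s, grep and File::Temp come from Perl itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical test data: one small and one large file in a scratch directory.
my $dir  = tempdir(CLEANUP => 1);
my %size = (small => 10, large => 20_000);
for my $name (keys %size) {
    open my $fh, '>', "$dir/$name" or die "Cannot write $dir/$name: $!";
    print {$fh} 'x' x $size{$name};
    close $fh or die "Cannot close $dir/$name: $!";
}
my @files = map { "$dir/$_" } sort keys %size;

# Explicit operand: -s takes $_, so "<" can only be a comparison.
my @explicit = grep { -s $_ < 10_000 } @files;

# Reversed comparison: nothing follows -s, so it falls back to $_.
my @reversed = grep { 10_000 > -s } @files;

# Parenthesised filetest, as in the original post.
my @parens = grep { (-s) < 10_000 } @files;
```

All three select the same file; which one reads best is a matter of taste.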

      Otherwise, it thinks that you were going to say something like -s <STDIN> but forgot the closing >.

      Ah, okay, that makes sense.

      I also understand where I was in error with the regex solution. Funny that I could have just put "    $_\n" :) I guess I didn't think of it because when I think map, I think some sort of function call or active statement should be going on, so the basic string literal eluded me.

      So not explicitly writing the $_ is generally bad? :(

      Thanks everyone!

        I more or less agree with a rule of thumb for $_ in Perl. If you have to use $_ explicitly, it might be better to use something else. I make exceptions for map and grep, though. I pretty much always expect $_ to appear there. In this specific case, I don't think it hurts anything to stick in an explicit $_ (but I'd probably like it better without). You could achieve that, if you really want, by reversing the test: grep 10_000 > -s, @ARGV.

        All just my opinions, of course.

Re: Newbie: parentheses, map, etc.
by Narveson (Chaplain) on Mar 04, 2008 at 03:29 UTC
    my @output = map s/(.*)/    $1\n/, @small_files;

    The substitution operator s returns the number of substitutions made, which is 1. It also alters the string it is bound to ($_ by default, as you are aware), but this alteration is by side-effect. Check the contents of @small_files immediately after this statement.

    What you want is

    my @output = map "    $_\n", @small_files;

    Updated: I am taking the liberty of deleting the false part of my original answer (which is accurately quoted and accurately corrected in kyle's reply below). Thanks, kyle. What was I thinking?

      The substitution operator s returns the list of captured substrings or, in scalar context, the number of captured substrings, which is 1.

      This isn't true. s/// always returns the number of substitutions made or the empty string if no substitutions were made (see perlop). The expression in map is in list context.

        The documentation lies. On failure, it returns the canonical false value, which is an empty string in string context and 0 in numeric context.
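The two return values discussed here can be checked with a short sketch. The variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $hit  = 'abc';
my $miss = 'abc';

my $changed   = ($hit  =~ s/b/B/);   # one substitution made
my $unchanged = ($miss =~ s/z/Z/);   # pattern does not match

# $unchanged is defined, equals '' as a string and 0 as a number,
# and the numeric use does not trigger a "isn't numeric" warning.
printf "changed=%d unchanged_string='%s' unchanged_number=%d\n",
    $changed, $unchanged, $unchanged;
```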
Re: Newbie: parentheses, map, etc.
by CountZero (Bishop) on Mar 04, 2008 at 06:49 UTC
    Everything together in one go:

    print map {"    $_\n"} grep {10000 > -s and -f} <c:/data/*>;

    Note that I included the -f test to make sure I only get files and not directories.

    1. <c:/data/*> grabs all files and directories in directory c:/data/ and returns them as a list
    2. grep {10000 > -s and -f} reduces this list to a list of files less than 10,000 bytes and returns that list
    3. Finally map {"    $_\n"} transforms the list returned by grep and hands it over to print
    And no parentheses used at all!

    As you see you can safely stack map and grep commands and as each works on the list to the right of it, you have to read this construct from right to left to get it in execution order.
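To make the right-to-left reading concrete, here is a sketch that runs the same kind of chain on a list of made-up byte counts instead of real files, followed by the same steps written out in execution order.

```perl
use strict;
use warnings;

my @sizes = (120, 45_000, 9_999, 10_000);   # hypothetical byte counts

# Read right to left: @sizes feeds grep, grep feeds map, map feeds join.
my $report = join '', map { "    $_\n" } grep { $_ < 10_000 } @sizes;

# The same steps, one statement per stage, in the order they run.
my @small  = grep { $_ < 10_000 } @sizes;
my @lines  = map  { "    $_\n" } @small;
my $manual = join '', @lines;
```

Both versions build the same string; the stacked form just avoids the intermediate arrays.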

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Newbie: parentheses, map, etc.
by Narveson (Chaplain) on Mar 04, 2008 at 03:57 UTC

    Second question.

    Why was that the case?

    Why did you have to put the parentheses in? Or why did that make you sad?

    If it made you sad because you felt there should be a way of writing it without parentheses, then your instincts are sound.

    The trouble, as theDamian explains in Perl Best Practices, "Mapping and Grepping", is that

    when the first argument to a map or grep is specified as an expression, it becomes harder to distinguish from the remaining arguments.

    His recommendation is to say

    map BLOCK LIST

    which in your case would be

    my @small_files = grep { -s < 10000 } @ARGV;
      The suggestion:
      my @small_files = grep { -s < 10000 } @ARGV;

      looked good to me, and turned out to be basically what the book gave as the answer as well... except I tried it myself and it doesn't compile. oO
      Gives the same error as my original version. (Unterminated <>)

      kyle's suggestion seems to work great though (reversing -s and 10000).
Re: Newbie: parentheses, map, etc.
by parv (Parson) on Mar 04, 2008 at 03:38 UTC

    Best (that word again) is not to rely on implicit variables, as you do with the -s function. Pass the file name in $_ to -s explicitly: @p = grep { -s $_ < 1 } @q.

    (Using a regular expression is overkill just to put a simple prefix and suffix around each list element (UURE).)

      That's certainly a matter of opinion. $_ is the default for many builtins, and I, at least, feel free to take advantage of that for convenience, brevity, and readability.
        No arguments here. If -s < $number worked as is, I would not have replied.
Re: Newbie: parentheses, map, etc.
by ikegami (Patriarch) on Mar 04, 2008 at 07:37 UTC
    For functions that modify $_ rather than return the transformed value, there's Filter in Algorithm::Loops and apply in List::MoreUtils. IIRC, both functions are identical. Unfortunately, neither module is Core.
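If installing a module is not an option, the list-context behaviour can be approximated with core Perl alone. The sketch below assumes that what is wanted is "run the block on a copy of each element and return the copies"; apply_copy is a hypothetical name, not a function from either module.

```perl
use strict;
use warnings;

# apply_copy is a hypothetical helper, not part of any module.
# It runs the block on a copy of each element and returns the copies.
sub apply_copy(&@) {
    my ($code, @copies) = @_;
    $code->() for @copies;      # $_ aliases each copy in turn
    return @copies;
}

my @small_files = ('a.txt', 'b.txt');   # made-up file names
my @output = apply_copy { s/(.*)/    $1\n/ } @small_files;
```

Because the block only ever sees the copies, @small_files keeps its original contents.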