http://qs1969.pair.com?node_id=501419

jfroebe has asked for the wisdom of the Perl Monks concerning the following question:

I've been reading a book on DBMS administration using perl (specific book doesn't matter) and have seen this in other perl books as well. (I'm not singling out any author)

Several books tend to avoid the use of map and grep. I can only assume that the authors have trouble explaining the usage and benefits of these in such a way that perl programmers can understand. How can we explain map and grep better?

One of the benefits is that often, using map or grep provides increased performance (see example below). Another is that you can chain the greps and maps together to make your life so much easier.

I'm convinced that map & grep aren't as heavily used as they could be because people don't understand them.

Example:

    #!/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    ## benchmark to determine best method
    ## of finding common elements in two
    ## arrays

    my @array1 = ( '0', '1', '2', '3', '5' );
    my @array2 = ( '0', '1', '2', '3', '4', '6' );

    cmpthese (100000, {
        'using foreach' => sub {
            my @common;
            foreach my $element1 (@array1) {
                foreach my $element2 (@array2) {
                    if ($element1 eq $element2) {
                        push @common, $element1
                            unless grep { $element1 eq $_ } @common;
                    }
                }
            }
        },
        'using grep' => sub {
            my @common;
            @common = grep {
                my $element1 = $_;
                ! grep { $element1 == $_ } @array2
            } @array2;
        }
    });

Benchmark:

    $ ./test_bm_array
                    Rate using foreach    using grep
    using foreach 6873/s            --          -30%
    using grep    9766/s           42%            --

Jason L. Froebe

Team Sybase member

No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil, Stargate SG-1

Re: question for perl book & magazine authors
by sauoq (Abbot) on Oct 19, 2005 at 21:00 UTC
    I'm convinced that map & grep aren't as heavily used as they could be because people don't understand them.

    I don't think I agree with the implied premise that map() and grep() should be used more heavily. I'd say they are probably used inappropriately at least as often as appropriate uses for them are missed.

    For instance, in the example you gave, you'd probably be better off using a hash to determine common elements.

    Update: You should also be more careful to check your benchmarks for correctness. Your "using grep" benchmark doesn't even reference @array1...

    Update 2: I also just noticed you are using grep in a scalar context in the unless clause in your "using foreach" sub. That's one of those inappropriate uses of grep I mentioned above...
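
    A minimal sketch of the hash approach mentioned above, using the sample arrays from the original post. The helper name common_elements is hypothetical, and the sketch assumes that string equality is the comparison wanted and that duplicates should be dropped from the result. It makes one pass over each array instead of comparing every pair.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data taken from the original post.
my @array1 = ( '0', '1', '2', '3', '5' );
my @array2 = ( '0', '1', '2', '3', '4', '6' );

# Return the elements of @$first that also occur in @$second,
# without duplicates, in the order they appear in @$first.
# (common_elements is a hypothetical name chosen for this sketch.)
sub common_elements {
    my ($first, $second) = @_;

    # One pass over @$second builds the lookup table.
    my %in_second = map { $_ => 1 } @$second;

    # One pass over @$first filters against it; %seen drops repeats.
    my %seen;
    return grep { $in_second{$_} && !$seen{$_}++ } @$first;
}

my @common = common_elements(\@array1, \@array2);
print "@common\n";
```

    The lookup hash costs memory proportional to the array it is built from, so passing the smaller array as the second argument keeps that cost down.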

    -sauoq
    "My two cents aren't worth a dime.";
    
      The hash trick is a nice approach if the array is small enough. For huge arrays, trading a slower algorithm for lower memory use may be worth the price.


      holli, /regexed monk/

        Sure... if memory is so tight that you have to choose an O(n * m) time complexity over O(n + m) then I guess you gotta do what you gotta do.

        Of course, in that case (and especially if this is a routine task) you are almost certainly going to win by throwing more RAM at the problem because the larger your dataset the more (time) you are saving with the linear algorithm.

        Update: Also note that the additional memory you need will be relative to the smaller of your two arrays. (As a possible optimization depending, of course, on your data.)

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: question for perl book & magazine authors
by runrig (Abbot) on Oct 19, 2005 at 21:48 UTC
    I'm convinced that map & grep aren't as heavily used as they could be because people don't understand them.
    And because (some) people don't understand them, it's also (sometimes) hard for people to use them correctly and tell at a glance what is going on. Take, for instance, your 'using grep' example. It doesn't work (also, you are using 'eq' in one vs. '==' in the other which may make an unfair benchmark). I assume you are finding common elements but eliminating any duplicates from the output. The grep example isn't eliminating items that have already been seen because it's not keeping track of what's been seen. I think you were aiming more for this (though I assume the O(n**2) algorithm is on purpose just for demonstration purposes):
    'using grep' => sub {
        my @common;
        my %seen;
        @common = grep {
            my $element1 = $_;
            !$seen{$element1}++ and grep { $element1 eq $_ } @array2
        } @array1;
    },
    Update: and after the fix, the benchmark doesn't favor grep nearly as much.
      dang typos of mine!

      Thanks! :)

      Jason L. Froebe

Re: question for perl book & magazine authors
by holli (Abbot) on Oct 19, 2005 at 20:56 UTC
    I don't think it's hard to understand grep and map. Both take a list from the right and return a list to the left. grep returns only those elements for which the intermediate expression evaluates to true; map returns the value of the expression itself for each element.

    One of the simplest and most powerful features in perl.
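
    A small illustration of that description, with a made-up list of numbers (the variable names are only for this sketch):

```perl
use strict;
use warnings;

my @numbers = (1, 2, 3, 4, 5, 6);

# grep keeps the elements for which the block is true.
my @even = grep { $_ % 2 == 0 } @numbers;

# map returns the value of the block for each element.
my @squares = map { $_ * $_ } @numbers;

# The two compose: the list flows from right to left,
# first filtered by grep, then transformed by map.
my @even_squares = map { $_ * $_ } grep { $_ % 2 == 0 } @numbers;

print "@even\n";
print "@squares\n";
print "@even_squares\n";
```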


    holli, /regexed monk/
Re: question for perl book & magazine authors
by perrin (Chancellor) on Oct 19, 2005 at 21:30 UTC
    Are you asking why they aren't more widely used? Probably because they aren't as widely understood and often result in code that's harder to read. (Your code has to resort to $_ for example, instead of a more informative variable name.) They definitely have their uses and I use them in my code when they seem to provide a benefit over other loop constructs, but a bunch of chained maps and greps is probably not the first tool to reach for if you're concerned about the legibility of your code.
Re: question for perl book & magazine authors
by VSarkiss (Monsignor) on Oct 19, 2005 at 21:13 UTC
Re: question for perl book & magazine authors
by Tanktalus (Canon) on Oct 20, 2005 at 03:42 UTC

    As to the question of grep and map being used as much as they could, I couldn't agree more. At work, I've had to train a few people to use them instead of foreach when they made sense ... and then untrain them a bit when they started using them where they didn't make sense. The point isn't to religiously worship them, but to make your code do what it says: if you're converting one list into another, you're mapping from one to the other. If you're finding elements in a list, you're grepping for them.

    By speaking idiomatic Perl, you may get a speed benefit (as you did in your benchmark), but, more importantly, you get a maintenance benefit: your code looks like a design document!

    That is actually part of the reason why I don't like maps and greps used in void context: they no longer read like plain English. "Map from @common to ... nothing?" Instead, just use for/foreach: "For each element in @common, do..." That reads exactly like the solution you're attempting. It is probably also how you're thinking of the problem, and any time you can code your solution in the domain of the problem instead of the domain of the solution, you're going to end up with clearer and more flexible code.
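
    A short sketch of the contrast, assuming a hypothetical @common list and a made-up logging side effect:

```perl
use strict;
use warnings;

my @common = ('0', '1', '2', '3');

# map in void context: it runs, but the list it returns is discarded,
# so the reader has to work out that only the side effect matters.
my @log;
map { push @log, "saw $_" } @common;

# The same side effect as a loop: "for each element in @common, do ...".
my @log2;
for my $element (@common) {
    push @log2, "saw $element";
}

# When a new list is the goal, map states that directly.
my @labels = map { "saw $_" } @common;

print "$_\n" for @labels;
```

    All three produce the same strings; the difference is only in how clearly each form states its intent.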

Re: question for perl book & magazine authors
by tomazos (Deacon) on Oct 19, 2005 at 21:48 UTC
    While we are talking about grep, does anyone know why grep is called grep? I would have thought filter would be a more appropriate name? Anyone know the "etymology" of the word grep?

    -Andrew.

      It comes from the (Unix) grep utility which stands for (g)lobal-(r)egular (e)xpression-(p)rint.
      In vim, the command :g/re/p (where g means "global", re is clearly "regex" and p means "print") will find all lines matching the regex and print them in a list. This was apparently true of vim's predecessors as well, but it's still a valid (and useful) command in vim today. See also Wikipedia.
        Says chester:
        In vim...
        The g/re/p command actually goes back to ed, which was the original Unix editor, back around 1972, and is still provided with all Unix systems. After ed came vi, the visual editor, with command syntax similar to ed's, and then vim, which is an improved version of vi.

        Early versions of Unix also had a gres command (perform a substitution on all matching lines) but it was obsoleted by sed and abandoned.

Re: question for perl book & magazine authors
by saberworks (Curate) on Oct 19, 2005 at 21:31 UTC
    I agree, mostly. I have been programming perl for quite a while and it still takes me longer than it should to pick apart what a grep or map is trying to do. I have gradually been using them more and more, but it also takes me longer to code them than their hash-or-for-based counterparts.
Re: question for perl book & magazine authors
by kwaping (Priest) on Oct 19, 2005 at 23:10 UTC
Re: question for perl book & magazine authors
by QM (Parson) on Oct 21, 2005 at 14:31 UTC
    map and grep are not picked up as quickly by new Perl programmers as for, for several reasons.

    First, new programmers will have difficulty lining up multiple actions all at once. Compare these:

    # First example: map
    my @newlist = map { $_ * 2 } @list;

    # Second example: explicit loop over the same @list
    my @newlist;
    for my $item (@list) {
        push @newlist, $item * 2;
    }
    The first example examines all values of @list (one at a time of course), multiplies each value by 2, and saves all resulting values into @newlist.

    The second example is easier for a newbie to manage. It loops through @list, one at a time. Each time through, $item will hold the active value. Multiply each value by 2, and save it in @newlist.

    Second, it's easier to debug the 2nd example. There are several places to breakpoint or print out intermediate results to check that the code behaves as expected. Printing out intermediate results in a map is also easy, but it has to be done carefully to avoid munging the map return values.
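
    A sketch of what "carefully" can mean here, with a made-up three-element @list. The last expression evaluated in the block is what map collects, so a debugging print placed last replaces the intended value:

```perl
use strict;
use warnings;

my @list = (1, 2, 3);

# Careless version: print is the last statement in the block, so map
# collects print's return value instead of the doubled number.
my @wrong = map {
    my $doubled = $_ * 2;
    print STDERR "doubled: $doubled\n";
} @list;

# Careful version: log first, then make the intended value
# the last expression in the block.
my @newlist = map {
    my $doubled = $_ * 2;
    print STDERR "doubled: $doubled\n";
    $doubled;
} @list;

print "@newlist\n";
```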

    My point is that newbies coming from other languages (those without map or grep) would be more comfortable with the 2nd form. Once entrenched, they are less likely to embrace map and grep without extra incentives.

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of