in reply to When would you choose foreach instead of map?

This whole thread can be summarized in one sentence: The single non-preference-based difference is that 'map' produces the side-effect of assigning to an 'output array' whereas 'foreach' does not. Everything else is personal preference.

TMTOWTDI Strikes again: (!rant warning!)

This thread perfectly demonstrates TMTOWTDI, and how it relates to "personal preference". In fact, what Perl folks like to call "TMTOWTDI", I call "The Law of Personal Preference". Its corollaries include:

The number one determining factor to influence code-reuse is personal preference. *Every* reason to choose programming option 'foo' over programming option 'bar' is based on personal preference.
In fact, to ask whether 'foo' or 'bar' is distinguishable outside the realm of personal preference is nearly *certain* to give you emotionally-influenced answers that are based *substantially* on personal preference.

Proof: Rephrase the original question according to the following template:

Is it possible to generate the result 'X' in such a way that I can automatically deduce the method that was used to generate that result (without looking at the source code) by evaluating one or more quantifiable measures of fitness?
... in other words, make up a little 'Turing test' ...
Find a person who can look at the output of a script and determine whether the output came from 'map' or from 'foreach'. They can evaluate everything except the source code itself (e.g., quantifiable measures of fitness, such as how long it takes the script to run or compile, or whether the output meets a specification). Hold all other things equal.
If your 'quantifiable measure of fitness' is *specifically* how many lines of code it takes to do something (which is a valid measure in itself), then just look at the different ways of doing it and pick the one that seems best to you. Absent an indication that one option is inherently less secure, more buggy, less performant, more likely to be deprecated (etc. etc. ad nauseam) than the others, there is no reason to choose one over the other (other than ... you guessed it ... personal preference).

### TMTOWTDI :
### - SnippetID: tsid="20040520_1649_06239"
### description: multiple ways to loop and munge an array
### details: |
### The following snippet shows a handful of
### alternate coding styles for looping and
### munging an array.

use strict;
use warnings;

my (@array, @outcopy) = ();
@array = qw(1 2 3 4 5);

### NOTE: we *dont* care about making an 'output copy' of the array
$_ *= 2 foreach @array;
print "@array";
print "\n............. \n";

$_ *= 2 for @array;
print "@array";
print "\n--------------\n";

map {$_ *= 2 } @array;
print "@array";
print "\n............. \n";

for (@array) {$_ *= 2 };
print "@array";
print "\n--------------\n";

foreach (@array) {$_ *= 2 };
print "@array";
print "\n............. \n";

### NOTE: here we DO care about making an 'output copy'
### notice that both produce same effect, but foreach
### requires additional code because it
### does not produce the side-effect
@outcopy = map {$_ *= 2 } @array;
print "@outcopy";
print "\n--------------\n";

@outcopy = (); ###<-- additional line
foreach (@array) {push @outcopy, $_ *= 2 };
print "@outcopy";
print "\n............. \n";
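One caveat about the snippet above: $_ is aliased to each element, so `$_ *= 2` changes @array itself in every variant, including the two 'output copy' cases. Here is a minimal sketch, assuming what you actually want is to leave the input alone and only fill the copy. The names @doubled and @pushed are mine for illustration, not part of the snippet.

```perl
use strict;
use warnings;

my @array = qw(1 2 3 4 5);

# map with a non-mutating expression: @array is left alone,
# and the doubled values land in a new array.
my @doubled = map { $_ * 2 } @array;

# The foreach equivalent needs an explicit push into the copy.
my @pushed;
push @pushed, $_ * 2 foreach @array;

print "@array\n";    # input, unchanged
print "@doubled\n";  # copy built by map
print "@pushed\n";   # copy built by foreach + push
```

Both copies come out the same; the only difference is that map hands back the list for you, which is the one distinction this post is about.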

Re: Re: When would you choose foreach instead of map?
by BrowserUk (Patriarch) on May 21, 2004 at 02:13 UTC
    Is it possible to generate the result 'X' in such a way that I can automatically deduce the method that was used to generate that result (without looking at the source code) by evaluating one or more quantifiable measures of fitness?

    Where 'X' is the sum of the numbers 1 .. 1_000_000.

    p:\test>x1
    500000500000

    p:\test>x2
    500000500000

    The programs (in no particular order).

    map{ $sum += $_ } 1 .. 1000000; print $sum;
    and
    $sum += $_ for 1 .. 1000000; print $sum;

    External metric: One of them uses 3MB, the other 96MB :)


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
      I don't know how to check for memory usage (other than the blunt method of watching the task manager), but here are some speed tests. It appears from this test that they are very similar in speed for this summing operation. For is the winner, but I'm not sure it's a noticeable difference until you climb into the several hundred thousand range.

      use strict;
      use warnings;
      use Benchmark qw(:all) ;

      my $to = 10_000;

      cmpthese(-5, {
          'Map100k' => sub { my $sum = 0; map { $sum += $_ } 1 .. 100_000; },
          'For100k' => sub { my $sum = 0; $sum += $_ for 1 .. 100_000; },
          'Map10k'  => sub { my $sum = 0; map { $sum += $_ } 1 .. 10_000; },
          'For10k'  => sub { my $sum = 0; $sum += $_ for 1 .. 10_000; },
          'Map1k'   => sub { my $sum = 0; map { $sum += $_ } 1 .. 1_000; },
          'For1k'   => sub { my $sum = 0; $sum += $_ for 1 .. 1_000; },
      });
      __END__
                Rate Map100k For100k Map10k For10k Map1k For1k
      Map100k 18.6/s      --    -19%   -90%   -92%  -99%  -99%
      For100k 23.1/s     24%      --   -88%   -90%  -99%  -99%
      Map10k   187/s    905%    709%     --   -22%  -91%  -92%
      For10k   239/s   1182%    933%    28%     --  -89%  -90%
      Map1k   2077/s  11052%   8880%  1009%   770%    --  -13%
      For1k   2387/s  12715%  10219%  1175%   899%   15%    --

      ___________
      Eric Hodges
        Your benchmark isn't comparing for-block vs map. It's comparing the for statement modifier against map. Statement modifiers don't have the overhead of entering and leaving a scope. Of course, in this particular case, the clear winner is one that doesn't loop.

        #!/usr/bin/perl

        use strict;
        use warnings;

        use Benchmark qw /cmpthese/;

        my $base = 1000;

        foreach my $mult (1, 10, 100) {
            my $code;
            my $max = $mult * $base;
            {
                no strict 'refs';
                @{"main::array$mult"} = 1 .. $max;
            }
            my $map_var = "\$sum${mult}map";
            my $for_var = "\$sum${mult}for";
            my $mod_var = "\$sum${mult}mod";
            my $exp_var = "\$sum${mult}exp";
            $code -> {"map${mult}k"} = "$map_var = 0;" .
                                       "map {$map_var += \$_} \@array$mult";
            $code -> {"for${mult}k"} = "$for_var = 0;" .
                                       "for (\@array$mult) {$for_var += \$_}";
            $code -> {"mod${mult}k"} = "$mod_var = 0;" .
                                       "$mod_var += \$_ for \@array$mult";
            $code -> {"exp${mult}k"} = "$exp_var = $max * ($max + 1) / 2;";

            cmpthese -1 => $code;
            print "\n";

            no strict 'refs';
            die "Unequal\n" unless ${"sum${mult}map"} == ${"sum${mult}for"} &&
                                   ${"sum${mult}mod"} == ${"sum${mult}exp"} &&
                                   ${"sum${mult}map"} == ${"sum${mult}exp"};
        }

        __END__
                   Rate   map1k   for1k   mod1k exp1k
        map1k    1305/s      --    -58%    -64% -100%
        for1k    3140/s    141%      --    -12% -100%
        mod1k    3589/s    175%     14%      -- -100%
        exp1k 6859842/s 525617% 218353% 191047%    --

                    Rate   map10k   for10k   mod10k exp10k
        map10k     123/s       --     -59%     -66%  -100%
        for10k     299/s     144%       --     -16%  -100%
        mod10k     356/s     190%      19%       --  -100%
        exp10k 7021713/s 5717581% 2347785% 1970572%     --

                     Rate   map100k   for100k   mod100k exp100k
        map100k    12.3/s        --      -60%      -64%   -100%
        for100k    30.5/s      148%        --      -11%   -100%
        mod100k    34.2/s      179%       12%        --   -100%
        exp100k 5383313/s 43894609% 17663897% 15724842%      --

        Abigail
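For anyone who skimmed past the exp* rows: the "one that doesn't loop" is just the closed-form sum n(n+1)/2. A small sketch, assuming nothing beyond the expression already used in the benchmark above (the variable names here are made up for illustration), that checks the loop and the formula agree:

```perl
use strict;
use warnings;

my $max = 100_000;

# Loop version: add up 1 .. $max one element at a time.
my $loop_sum = 0;
$loop_sum += $_ for 1 .. $max;

# Closed form: the same expression used for the exp* entries.
my $closed_sum = $max * ($max + 1) / 2;

print "loop:   $loop_sum\n";
print "closed: $closed_sum\n";
die "Unequal\n" unless $loop_sum == $closed_sum;
```

The formula does a fixed amount of work regardless of $max, which is why its rate barely moves between the three tables while the looping variants slow down roughly tenfold each time.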

      I thought map() was optimized for use in void context. I'd never use map() in a case such as this, as for() is much more elegant (all IMO of course). Can you explain why the map() version takes up so much memory?

        I believe* that the reason has nothing to do with the void context. for is optimised to treat the range operator as an iterator, by which I mean it doesn't generate a list of 1_000_000 scalars; it simply supplies the next incremented value as $_ for each iteration.

        map on the other hand doesn't have this optimisation and so a very large list is generated as the input to map. This is what (briefly) consumes the memory. For short ranges this isn't a problem, but for larger ones, it's worth avoiding.

        * I'm fairly confident that this is the case, but it is acquired knowledge rather than something I can point my finger at the sources and say "There".
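A rough way to see the iterator behaviour described above for yourself (a sketch, not a proof from the sources): loop over a range far too big to want in memory and bail out early. perlop's section on range operators also notes that, in the current implementation, no temporary array is created when a range is the expression of a foreach loop. The variable $seen is just an illustrative name.

```perl
use strict;
use warnings;

# foreach walks the range as a counter, so this loop can bail out
# early without ever building a billion scalars.
my $seen = 0;
for my $i (1 .. 1_000_000_000) {
    $seen = $i;
    last if $i >= 3;
}
print "stopped at $seen\n";

# The map equivalent is deliberately left commented out: going by the
# explanation above, it would try to build the whole input list first.
# my @out = map { $_ } 1 .. 1_000_000_000;
```

The for loop returns immediately. Uncomment the map line only on a machine you don't mind swapping.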



      I am gonna go out on a limb and wildly speculate that if one of them uses 96MB and the tests are accurate, it was *map*, based on the whole 'side-effect' thingy. If true, this re-emphasizes the original point that this is the one (non-preference-based) difference.

        As always, don't take my word for it, do your own test. All (two lines) of the code is there. The only instrumentation required is a <STDIN>; to prevent the processes from exiting, plus top or Task Manager.

        Is this the "'side effect' thingy"? I guess you could classify it that way:)
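If you want to try the <STDIN> trick mentioned above, something along these lines should do. It is only a sketch: the two subs and the command-line switch are hypothetical conveniences of mine, wrapped around the same two one-liners from earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers; the thread itself only shows the two one-liners.
sub sum_with_for {
    my ($n) = @_;
    my $sum = 0;
    $sum += $_ for 1 .. $n;
    return $sum;
}

sub sum_with_map {
    my ($n) = @_;
    my $sum = 0;
    map { $sum += $_ } 1 .. $n;    # void-context map, as in the thread
    return $sum;
}

# Pick the variant from the first command-line argument (default: for).
my $mode = (@ARGV && $ARGV[0] eq 'map') ? 'map' : 'for';
my $sum  = $mode eq 'map' ? sum_with_map(1_000_000)
                          : sum_with_for(1_000_000);
print "$mode: $sum\n";

# Hold the process open so top / Task Manager can be read.
# Skipped when STDIN is not a terminal, e.g. when output is piped.
<STDIN> if -t STDIN;
```

Run it once with no argument and once with `map`, and compare the memory figure for each process while it sits waiting for Enter. The numbers you see will depend on your perl version and platform, so don't expect to reproduce 3MB vs 96MB exactly.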

