mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, PerlMonks!

I'm trying to figure out if foreach is faster than grep with map. I've replaced grep and map in a large script with foreach and noticed a substantial speed improvement, so my guess is that it is faster. However, I'd like to verify that somehow.

With the two scripts below, am I comparing the same things and not apples to oranges? Or have I misunderstood grep and map?

#!/usr/bin/perl
use strict;
use warnings;

my @c = (1 .. 10000000);

my @d;
foreach my $dd (@c) {
    push(@d, $dd % 2);
}

my @e;
foreach my $ee (@d) {
    if (!$ee) {
        push(@e, $ee);
    }
}

exit(0);

The script above is much faster than the one below, according to time.

#!/usr/bin/perl
use strict;
use warnings;

my @c = (1 .. 10000000);
my @d = map( $_ % 2, @c );
my @e = grep( /0/, @d );

exit(0);

Replies are listed 'Best First'.
Re: Speed comparison of foreach vs grep + map
by Corion (Patriarch) on May 25, 2025 at 18:01 UTC

    Without looking at your timings, the second script is doing something different than the first script:

    my @e = grep(/0/, @d);

    This invokes the regex engine, matching any value whose string form contains a "0", instead of simply testing $_ for truth.

    Maybe you meant to write something like:

    my @e = grep(!$_, @d);
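
    To see concretely that the two filters are not equivalent, here is a small illustrative one-liner (not part of the original reply): /0/ keeps any value whose string form contains the character "0", such as 10 or 103, while !$_ keeps only false values. In the OP's scripts the array holds only 0s and 1s, so both filters happen to select the same elements there; the difference is the cost of running the regex engine ten million times.

    ```shell
    perl -le'
        my @vals = ( 10, 0, 1, 103 );
        # regex match: keeps anything containing the character "0"
        print join ",", grep /0/, @vals;    # 10,0,103
        # truth test: keeps only false values
        print join ",", grep !$_, @vals;    # 0
    '
    ```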
Re: Speed comparison of foreach vs grep + map
by ikegami (Patriarch) on May 25, 2025 at 18:28 UTC

    As Corion pointed out, the switch from !$_ to /0/ is overpowering the other differences and invalidating your test.

    But other improvements can be made.


    There's no reason to use an intermediate array in the latter snippet. Chaining map into grep avoids building the ten-million-element temporary array @d.

    my @e = grep !$_, map $_ % 2, @c;

    You could even eliminate the grep.

    my @e = map $_ % 2 ? () : 0, @c;

    Similar improvements can be made for the foreach version.

    my @e; for my $c ( @c ) { my $d = $c % 2; push @e, $d if !$d; }
    my @e; for my $c ( @c ) { push @e, 0 if !( $c % 2 ); }

    Finally, I'm curious how these compare:

    my @e = ( 0 ) x grep !( $_ % 2 ), @c;
    my @e; push @e, 0 for 1 .. grep !( $_ % 2 ), @c;

      Benchmarks:

      #!/usr/bin/perl
      use strict;
      use warnings;

      use Benchmark qw( cmpthese );

      my @c = 1 .. 1_000_000;

      cmpthese( -3, {
          for      => sub { my @e; for my $c ( @c ) { my $d = $c % 2; push @e, $d if !$d; } },
          for2     => sub { my @e; for my $c ( @c ) { push @e, 0 if !( $c % 2 ); } },
          map_grep => sub { my @e = grep !$_, map $_ % 2, @c; },
          map      => sub { my @e = map $_ % 2 ? () : 0, @c; },
          grep_x   => sub { my @e = ( 0 ) x grep !( $_ % 2 ), @c; },
          grep_for => sub { my @e; push @e, 0 for 1 .. grep !( $_ % 2 ), @c; },
      } );
                  Rate map_grep   for grep_for   map  for2 grep_x
      map_grep  10.6/s       --  -27%     -43%  -47%  -51%   -54%
      for       14.6/s      38%    --     -21%  -27%  -33%   -36%
      grep_for  18.4/s      75%   27%       --   -7%  -15%   -20%
      map       19.9/s      88%   36%       8%    --   -9%   -13%
      for2      21.7/s     106%   49%      18%    9%    --    -5%
      grep_x    22.9/s     117%   57%      24%   15%    6%     --

                  Rate map_grep   for grep_for   map  for2 grep_x
      map_grep  12.2/s       --  -16%     -34%  -40%  -45%   -49%
      for       14.6/s      20%    --     -22%  -29%  -35%   -39%
      grep_for  18.6/s      52%   27%       --   -9%  -17%   -23%
      map       20.5/s      68%   40%      10%    --   -8%   -15%
      for2      22.4/s      83%   53%      20%    9%    --    -7%
      grep_x    24.1/s      97%   65%      29%   17%    8%     --

                  Rate map_grep   for grep_for   map  for2 grep_x
      map_grep  12.1/s       --  -19%     -31%  -36%  -45%   -49%
      for       14.8/s      23%    --     -15%  -21%  -32%   -37%
      grep_for  17.4/s      44%   17%       --   -8%  -21%   -27%
      map       18.8/s      56%   27%       8%    --  -14%   -21%
      for2      21.9/s      82%   48%      26%   16%    --    -8%
      grep_x    23.7/s      97%   60%      37%   26%    8%     --
Re: Speed comparison of foreach vs grep + map
by GrandFather (Saint) on May 25, 2025 at 21:23 UTC

    Do you have a benchmarked application where the benchmark is showing that foreach/grep,map is the specific bottleneck and causing tens of seconds of extra run time? Because if not, choose the best tool to make the code as easy as possible to understand.

    You will usually save a lot more time writing maintainable code than seeking the fastest runtime you can manage.

    If you do need to improve runtime you should try to be smart rather than clever. A smart solution is likely to approach the problem from an entirely different direction. A clever solution is likely to be a nasty mess arrived at by iterative tweaking the code to squeeze the last drop of performance out of a sub-optimum algorithm.

    Optimising for fewest key strokes only makes sense when transmitting to Pluto or beyond.

      If you do need to improve runtime you should try to be smart rather than clever. A smart solution is likely to approach the problem from an entirely different direction. A clever solution is likely to be a nasty mess arrived at by iterative tweaking the code to squeeze the last drop of performance out of a sub-optimum algorithm.

      -- GrandFather

      Love it! ... so much that I could not restrain myself from adding it to On Code Optimization and Performance References. :-)

      BTW, my favourite quote from that node is from Michael Abrash:

      Without good design, good algorithms, and complete understanding of the program's operation, your carefully optimized code will amount to one of mankind's least fruitful creations -- a fast slow program.

      👁️🍾👍🦟
      Do you have a benchmarked application where the benchmark is showing that foreach/grep,map is the specific bottleneck and causing tens of seconds of extra run time?

      I am processing some medium-sized files (1.6 MB to 2.5 MB) and notice that the run time goes up from ~3 s to 5+ s with grep and map compared to foreach. However, I see now that I can tune things better.

Re: Speed comparison of foreach vs grep + map
by NERDVANA (Priest) on May 26, 2025 at 17:26 UTC

    For large arrays, yes grep and map are slower than iterating the array. If you really get down into the weeds of perl performance, you'll find that any curly brace scope slows things down as well, so grep { $condition } @list is slower than grep $condition, @list.
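
    The block-versus-expression difference is easy to measure with the core Benchmark module. A minimal sketch (not from the original post; absolute numbers will vary by machine and perl version):

    ```shell
    perl -MBenchmark=cmpthese -e'
        my @c = 1 .. 100_000;
        cmpthese( -1, {
            # curly-brace block form: enters a scope for every element
            grep_block => sub { my @e = grep { !( $_ % 2 ) } @c },
            # expression form: no block, usually a little faster
            grep_expr  => sub { my @e = grep !( $_ % 2 ), @c },
        } );
    '
    ```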

    But, the other posts here are the important bit - focus on optimizing your algorithm before you try to squeeze a few extra percent out of the operators you use. Knowing that map and grep are a bit slower doesn't stop me from using them in my everyday code, because most of my (primarily webapp) code is author-speed limited, not cpu-speed limited. In the rare cases where I want something to run faster, I whip out Devel::NYTProf and then look for the hotspots. Even then, most of the time the report clues me into problems with my algorithm, and I don't need to optimize expressions.

Re: Speed comparison of foreach vs grep + map
by jwkrahn (Abbot) on May 26, 2025 at 03:31 UTC
    $ perl -le'
        use warnings;
        use strict;
        my @c = 1 .. 10_000_000;
        print @c . " @c[0..19]";
        my @d;
        for my $dd ( @c ) { push @d, $dd % 2; }
        print @d . " @d[0..19]";
        my @e;
        for my $ee ( @d ) { if ( !$ee ) { push @e, $ee; } }
        print @e . " @e[0..19]";
    '
    10000000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    10000000 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
    5000000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    $ perl -le'
        use warnings;
        use strict;
        my @c = 1 .. 10_000_000;
        print @c . " @c[0..19]";
        my @d = map $_ % 2, @c;
        print @d . " @d[0..19]";
        my @e = grep /0/, @d;
        print @e . " @e[0..19]";
    '
    10000000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    10000000 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
    5000000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

    The quickest way to do what you want:

    $ perl -le'
        my @c = 1 .. 10_000_000;
        print @c . " @c[0..19]";
        my @e = ( 0 ) x ( @c / 2 );
        print @e . " @e[0..19]";
    '
    10000000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
    5000000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    Naked blocks are fun! -- Randal L. Schwartz, Perl hacker

      It's ...odd to assume the array contains 1..N for the last one, but not for the first two.

Re: Speed comparison of foreach vs grep + map
by LanX (Saint) on May 26, 2025 at 16:47 UTC
    I think this is an XY problem

    You seem to be processing files, and those operations are in general orders of magnitude slower than any loop construct.

    For instance slurping instead of iterating can make a huge difference.

    So I hope you are not comparing apples with oranges.
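
    For illustration, slurping means reading the whole file in one operation rather than line by line. A minimal, self-contained sketch (the temp file name is made up for the demo):

    ```shell
    printf 'line 1\nline 2\n' > /tmp/slurp_demo.txt
    perl -e'
        open my $fh, "<", "/tmp/slurp_demo.txt" or die "open: $!";
        # locally undef-ing $/ makes <$fh> return the whole file at once
        my $content = do { local $/; <$fh> };
        print length($content), " bytes\n";    # 14 bytes
    '
    ```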

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      Sorry for the delay.

      I do read in the whole file at once and pass it as a variable to XML::Twig, making three consecutive passes. Here are the first, second, and final passes. The first pass just collects some information; the second pass actually starts processing the elements. There are a few map and grep functions used in the various handlers (not listed):

      ...
      my $xml = XML::Twig->new(
          pretty_print    => 'nsgmls',   # nsgmls for parsability
          output_encoding => 'UTF-8',
          twig_roots      => { 'office:automatic-styles' => 1 },
          twig_handlers   => {
              'style:style[@style:family="text"]/style:text-properties'
                  => \&handler_style_collector,
              'style:style' => \&handler_paragraph_style_collector,
          },
      );
      # $content is not saved from the first pass, it only builds some hashes
      $xml->parse($content);
      $xml->dispose;

      $xml = XML::Twig->new(
          pretty_print    => 'nsgmls',   # nsgmls for parsability
          output_encoding => 'UTF-8',
          twig_roots      => { 'office:body' => 1 },
          twig_handlers   => {
              # link anchors (text:bookmark) must be handled before
              # processing the internal links (text:a)
              '*[text:bookmark]' => \&handler_bookmark,
              'text:note[@text:note-class="footnote"]/text:note-body'
                  => \&handler_footnotes,
              'text:note-citation' => \&handler_citation,  # only some kinds
              'text:span'          => \&handler_span,      # typographic markup
              'text:list-item'     => \&handler_list_item, # all lists become unordered
              'table:table-header-rows' => \&handler_table_header_rows,
              'table:table-row'    => \&handler_table_row,
              'table:table'        => \&handler_table,     # primitive table support
              'text:line-break'    => \&handler_line_break,
              'text:table-of-content'   => sub { $_->delete },
              'text:index-body'         => sub { $_->delete },
              'text:alphabetical-index' => sub { $_->delete },
          },
      );
      $xml->parse($content);
      $content = $xml->sprint;
      $xml->dispose;

      $xml = XML::Twig->new(
          pretty_print    => 'nsgmls',
          empty_tags      => 'html',
          output_encoding => 'UTF-8',
          twig_roots      => { 'office:body' => 1 },
          twig_handlers   => {
              # links (text:a) must be handled after the link targets (text:bookmark)
              'text:a'       => \&handler_links,
              'text:h'       => \&handler_h,
              'text:p'       => \&handler_p,
              'draw:frame'   => \&handler_draw_frame,
              'office:annotation'     => sub { $_->delete },
              'office:annotation-end' => sub { $_->delete },
              'text:sequence-decls'   => sub { $_->delete },
              'text:tracked-changes'  => sub { $_->delete },
              'text:table-of-content' => sub { $_->delete },
              'office:forms'          => sub { $_->delete },
              'text:list'    => \&handler_lift_up,
              'text:section' => \&handler_lift_up,
              'office:body'  => \&handler_lift_up,
              'office:text'  => \&handler_lift_up,
          },
      );
      $xml->parse($content);
      $content = $xml->sprint;
      $xml->dispose;
      ...

      The first pass is necessary to collect some information about typographical markup before actual processing begins. Then there are some manipulations which have to be kept separate for the second and third passes, but that helps because spreading some of the other handlers across the two passes seems to speed up processing.

      Then there is some regex tidying of the markdown stored in $content afterward. It seemed best to do that as regex.

      It's a bit moot at this point, though: while the script does the job quite nicely on the use-cases I've tried it on, the person I wrote it for has access to additional data (for his specific use-case), so he was inspired to write a second version which is more tightly coupled with that data.