Re^3: Speed comparison of foreach vs grep + map

I suggest you create and post a working program using Perl's core Benchmark module that anyone can just cut and paste and run for themselves. Apart from being an excellent learning exercise for you, doing that should also provoke more helpful responses from the monks.

Some example PM nodes that have taken that approach:

Re: Confused by RegEx count by choroba
Fastest way to lookup a point in a set
Re^3: looping efficiency (Benchmark Example)
How to do popcount (aka Hamming weight) in Perl (popcount References)
Re^2: Speed comparison of foreach vs grep + map by ikegami - an example later in this thread, benchmarking map, grep and for
Re^2: Perl at Rosetta Code, with one particular example by anonymonk - comparing his solution with tybalt89's for the same Rosetta Code task

See also

perlperf - Perl Performance and Optimization Techniques (perldoc)
on Code Optimization and Performance References

Updated: Added more example PM nodes and See also section.



Replies are listed 'Best First'.

Re^4: Speed comparison of foreach vs grep + map
by mldvx4 (Friar) on May 26, 2025 at 13:13 UTC

Thanks, yet again. The Benchmark module, loaded with qw(:hireswallclock), really shows the difference between the two code segments:

        use Benchmark qw(:hireswallclock);

        my $t0 = Benchmark->new;
        # build TOC
        foreach my $h (split(/\n/, $page)) {
            if ($h =~ m/^={2,3}\s/) {
                push(@toc, $file."\t".$h);
            }
        }
        my $t1 = Benchmark->new;
        my $td = timediff($t1, $t0);
        print "the code took:",timestr($td),"\n";

and

        my $t0 = Benchmark->new;
        # build TOC
        push(@toc,
              map(m/^={2,3}\s/ ? $file."\t".$_ : (),
                   split(/\n/, $page) ) );  
        my $t1 = Benchmark->new;
        my $td = timediff($t1, $t0);
        print "the code took:",timestr($td),"\n";
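Before timing the two segments, it may be worth confirming that they build the same list. The following is only a minimal sketch: the sample `$page` and the file name `sample.md` are made up for illustration, and the real input comes from the ODF conversion.

```perl
use strict;
use warnings;

# Hypothetical sample input, not the real data.
my $file = 'sample.md';
my $page = join "\n",
    '= Title',
    '== Chapter one',
    'Some body text.',
    '=== Section 1.1',
    '==== Too deep to list',
    "==\tTab after the markers",
    '==No whitespace, so not a heading';

# Variant 1: foreach loop with a regex test.
my @toc_foreach;
foreach my $h (split(/\n/, $page)) {
    if ($h =~ m/^={2,3}\s/) {
        push(@toc_foreach, $file."\t".$h);
    }
}

# Variant 2: map that returns an empty list for non-matching lines.
my @toc_map = map(m/^={2,3}\s/ ? $file."\t".$_ : (), split(/\n/, $page));

# Same length and no differing element means the two lists are identical.
my $same = @toc_foreach == @toc_map
    && !grep { $toc_foreach[$_] ne $toc_map[$_] } 0 .. $#toc_foreach;

print $same ? "identical\n" : "different\n";
print "$_\n" for @toc_foreach;
```

If the two lists agree on representative input, any timing difference comes from how the list is built rather than from what is being matched.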

With the data sets I have, the first one is about a second faster per subset of data. I hope those two segments are similar enough to compare. I don't think I can use substr(), because the pieces sought are of variable length and the alternation slows it way down:

        foreach my $h (split(/\n/, $page)) {
            if (substr($h,0,3) eq '== ' 
                 or substr($h,0,4) eq '=== ') {
                push(@toc, $file."\t".$h);
            }
        }

That adds up to 1.5 seconds.
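For anyone who wants to cut and paste, here is a rough sketch of the three variants under Benchmark's cmpthese. The input is synthetic (20,000 generated lines and a made-up file name), so the numbers it prints say nothing definite about the real data. Note also that the substr() test only accepts a literal space after the markers, while `\s` also accepts a tab and other whitespace, so the variants are only guaranteed to agree on input like the sample here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Synthetic stand-in for the real markdown; size and contents are assumptions.
my $file = 'sample.md';
my $page = join "\n", map {
      $_ % 50 == 0 ? "== Chapter $_"
    : $_ % 10 == 0 ? "=== Section $_"
    :                "Body text line $_ with some filler words."
} 1 .. 20_000;

# foreach loop with a regex test.
sub toc_foreach_regex {
    my @toc;
    foreach my $h (split(/\n/, $page)) {
        push(@toc, $file."\t".$h) if $h =~ m/^={2,3}\s/;
    }
    return \@toc;
}

# map returning an empty list for non-matching lines.
sub toc_map_regex {
    my @toc = map(m/^={2,3}\s/ ? $file."\t".$_ : (), split(/\n/, $page));
    return \@toc;
}

# foreach loop with two fixed-length substr() comparisons.
sub toc_foreach_substr {
    my @toc;
    foreach my $h (split(/\n/, $page)) {
        push(@toc, $file."\t".$h)
            if substr($h, 0, 3) eq '== ' or substr($h, 0, 4) eq '=== ';
    }
    return \@toc;
}

# Sanity check: all three variants must agree on this sample before timing.
my $expected = join "\n", @{ toc_foreach_regex() };
for my $variant (\&toc_map_regex, \&toc_foreach_substr) {
    die "variants disagree on the sample data\n"
        if join("\n", @{ $variant->() }) ne $expected;
}

# A negative count runs each variant for at least that many CPU seconds,
# then prints a table of rates and relative differences.
cmpthese(-1, {
    foreach_regex  => \&toc_foreach_regex,
    map_regex      => \&toc_map_regex,
    foreach_substr => \&toc_foreach_substr,
});
```

Wrapping each variant in a sub and letting cmpthese choose the iteration count tends to give steadier numbers than a single timediff around one pass, though the sub call overhead is then part of every measurement.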

For what it's worth, the data is by that point markdown text, and a lot of it. The sources are OpenDocument Format word processing files. I got the first 90% of the task done in pure XML using XML::Twig in two afternoons, thanks to advice here. It's the second 90% (dealing with nested lists, cross-references, and individual chapters) which is taking even more time. It seems that few people use sections within their ODF files, so the "chapters" all end up under the same parent element. Parkinson's law applies as well.
