The other day here, someone mentioned the idea of using

  grep defined, @array

rather than the more usual block form

  grep {defined} @array

The reasoning is that, since it avoids creating a block scope, the former code will run faster than the latter. Indeed, decompiling the two snippets tends to confirm the hypothesis (all other op-codes being equal).

  # perl -MO=Concise -e 'grep {defined} @array'
  a  <@> leave[1 ref] vKP/REFC ->(end)
  1     <0> enter ->2
  2     <;> nextstate(main 2 -e:1) v ->3
  7     <|> grepwhile(other->8)[t2] vK/1 ->a
  6        <@> grepstart K*/2 ->7
  3           <0> pushmark s ->4
  -           <1> null lK/1 ->4
  -              <1> null sK/1 ->7
  -                 <@> scope sK ->7
  -                    <0> ex-nextstate v ->8
  9                    <1> defined sK/1 ->-
  -                       <1> ex-rv2sv sK*/1 ->9
  8                          <$> gvsv(*_) s ->9
  5           <1> rv2av[t1] lKM/1 ->6
  4              <$> gv(*array) s ->5

  # perl -MO=Concise -e 'grep defined, @array'
  a  <@> leave[1 ref] vKP/REFC ->(end)
  1     <0> enter ->2
  2     <;> nextstate(main 1 -e:1) v ->3
  7     <|> grepwhile(other->8)[t2] vK/1 ->a
  6        <@> grepstart K/2 ->7
  3           <0> pushmark s ->4
  -           <1> null lK/1 ->4
  9              <1> defined sK/1 ->7
  -                 <1> ex-rv2sv sK*/1 ->9
  8                    <$> gvsv(*_) s ->9
  5           <1> rv2av[t1] lKM/1 ->6
  4              <$> gv(*array) s ->5

As it turns out, I've got some code that is too slow, and a recurring theme in that code is that I have lists that may contain an empty string, and I need to process all the non-empty strings. Hence there are many loops of the form

  for my $element (grep {$_ ne ''} @array) {...}

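So I've been wondering what impact, if any, it would have on performance if I were to change over to the blockless form. So I set up a benchmark with the following items:

  grep {$_ ne ''} @array    (block_ne)
  grep {length} @array      (block_len)
  grep $_ ne '', @array     (bare_ne)
  grep length, @array       (bare_len)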

I ran those snippets against a medium-length array with no empty strings, and against a short and a long array that each contain an empty string. I should point out that my usual definition of long is a hundred thousand or a million elements, but here I was interested in a specific domain, where long is never more than twenty or so.

The results surprised me. In fact, they are so much all over the map that it is difficult to draw any conclusions. The results that interest me the most are those of short and none. As it happens, these two show the greatest differences. Alas, they are just about mirror images of each other in terms of performance.

But to a first approximation, the performance of grep {code} @list is not shabby at all, and the remaining differences are lost in the noise.

                       Rate short_block_ne short_bare_ne short_bare_len short_block_len
  short_block_ne  3008231/s             --           -0%           -10%            -20%
  short_bare_ne   3014515/s             0%            --           -10%            -20%
  short_bare_len  3342828/s            11%           11%             --            -12%
  short_block_len 3783871/s            26%           26%            13%              --

                      Rate none_bare_len none_block_len none_bare_ne none_block_ne
  none_bare_len  3390307/s            --            -9%         -11%          -13%
  none_block_len 3708523/s            9%             --          -3%           -5%
  none_bare_ne   3807187/s           12%             3%           --           -2%
  none_block_ne  3885959/s           15%             5%           2%            --

                      Rate long_bare_ne long_block_len long_block_ne long_bare_len
  long_bare_ne   3635361/s           --            -6%           -6%           -8%
  long_block_len 3869054/s           6%             --           -0%           -2%
  long_block_ne  3872708/s           7%             0%            --           -2%
  long_bare_len  3963159/s           9%             2%            2%            --

And the benchmark code:

  #! /usr/bin/perl -w
  use strict;
  use Benchmark 'cmpthese';

  # Package variables rather than lexicals: the snippets below are passed
  # as strings and eval'ed inside Benchmark, where 'my' variables declared
  # in this file would not be visible.
  our @none  = ('a' .. 'm');
  our @short = ('a', '');
  our @long  = ('a' .. 'z', '');

  # Optional iteration count from the command line; a negative value
  # tells Benchmark to run each snippet for at least that many CPU seconds.
  my $iter = shift || -1;

  # Sanity check: each form should print only the non-empty element.
  print "block_ne : [$_]\n" for grep {$_ ne ''} @short;
  print "block_len: [$_]\n" for grep {length} @short;
  print "bare_ne  : [$_]\n" for grep $_ ne '', @short;
  print "bare_len : [$_]\n" for grep length, @short;

  cmpthese( $iter, {
      short_block_ne  => q{grep {$_ ne ''} @short},
      short_block_len => q{grep {length} @short},
      short_bare_ne   => q{grep $_ ne '', @short},
      short_bare_len  => q{grep length, @short},
  } );

  cmpthese( $iter, {
      none_block_ne  => q{grep {$_ ne ''} @none},
      none_block_len => q{grep {length} @none},
      none_bare_ne   => q{grep $_ ne '', @none},
      none_bare_len  => q{grep length, @none},
  } );

  cmpthese( $iter, {
      long_block_ne  => q{grep {$_ ne ''} @long},
      long_block_len => q{grep {length} @long},
      long_bare_ne   => q{grep $_ ne '', @long},
      long_bare_len  => q{grep length, @long},
  } );

I think I'll change grep {$_ ne ''} to grep {length} and be done with it.
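
Before making that swap everywhere, it is worth convincing oneself that the four forms really select the same elements. Here is a minimal sketch of such a check. The sample @array is made up for illustration (it includes '0', a string that is false in boolean context but not empty), and the variable names are just the benchmark labels reused.

```perl
use strict;
use warnings;

# Hypothetical sample data: two empty strings, plus '0', which is
# false as a boolean but still has length 1 and is not equal to ''.
my @array = ('a', '', 'b', '0', '');

# The four forms from the benchmark.
my @block_ne  = grep { $_ ne '' } @array;
my @block_len = grep { length } @array;
my @bare_ne   = grep $_ ne '', @array;
my @bare_len  = grep length, @array;

# Each line printed should be identical: a,b,0
print join(',', @$_), "\n" for \@block_ne, \@block_len, \@bare_ne, \@bare_len;
```

So for lists of defined strings, at least, grep {length} looks like a drop-in replacement for grep {$_ ne ''}.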

• another intruder with the mooring in the heart of the Perl


In reply to Benchmarking the block and list forms of grep by grinder
