I need to read many text files that represent 2D data (matrices), where the "interesting" lines contain column and row indexes. Some of those lines should be skipped. The whole project works well and is fast enough; I'm just idly puzzled by the benchmarks I got later, when I tried to "improve"/refactor it. The data and code below are reduced to nonsense to make an SSCCE.

use strict;
use warnings;
use feature 'say';
use List::Util 'any';
use Benchmark 'cmpthese';

my $data = '';
for my $r ( 0 .. 31 ) {
    for my $c ( 0 .. 31 ) {
        $data .= "$c $r whatever\n"
    }
}

# say $data; die;

my @skip = ( 0, 15, 16, 31 );

cmpthese -1, {
    ugly => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            next if $1 == 0 or $1 == 15 or $1 == 16 or $1 == 31;
            next if $2 == 0 or $2 == 15 or $2 == 16 or $2 == 31;
            # something useful happens here,
            # after uninteresting entries have been skipped
        }
        return 1
    },
    ugly_cr => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            my ( $c, $r ) = ( $1, $2 );
            next if $c == 0 or $c == 15 or $c == 16 or $c == 31;
            next if $r == 0 or $r == 15 or $r == 16 or $r == 31;
        }
        return 1
    },
    any => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            next if any { $1 == $_ } @skip;
            next if any { $2 == $_ } @skip;
        }
        return 1
    },
    any_cr => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            my ( $c, $r ) = ( $1, $2 );
            next if any { $c == $_ } @skip;
            next if any { $r == $_ } @skip;
        }
        return 1
    }
};

Output:

          Rate  any_cr     any    ugly ugly_cr
any_cr   331/s      --    -54%    -64%    -74%
any      724/s    119%      --    -22%    -43%
ugly     930/s    181%     28%      --    -26%
ugly_cr 1265/s    282%     75%     36%      --

The initial, working code is similar to "ugly_cr". Then I thought I might postpone the assignment to lexicals until after the irrelevant lines had been filtered out. Would that be faster? No. I speculate that "ugly" is slower because $1 and friends are read-only, so they are numified again on each of the 4 comparisons. Is this correct?

Then I thought "any", being XS, might be fast, nicer to look at, and easier to extend with more rows/columns to skip later. It turns out to be a little slow for an array of just 4 elements to skip, and that result doesn't surprise me much. What completely puzzles me is that "any_cr" is slower still. Why? And why the asymmetry between "ugly vs. ugly_cr" and "any vs. any_cr"? I don't understand.
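One hypothesis that might be worth checking for the "any vs. any_cr" asymmetry: the block given to any is an anonymous sub. When it refers to a lexical such as $c, it is a closure, and Perl creates a fresh copy of it each time the expression is evaluated; when it refers only to globals such as $1, the same sub can be reused. The sketch below only illustrates that difference by counting distinct code refs; it does not measure the benchmark itself, and the names $global, $lexical, @from_global and @from_lexical are made up for the illustration.

```perl
use strict;
use warnings;
use Scalar::Util 'refaddr';

our $global = 0;    # package variable, standing in for a global like $1
my ( @from_global, @from_lexical );

for my $i ( 1 .. 3 ) {
    my $lexical = $i;    # fresh lexical per iteration, like "my ( $c, $r )"
    $global = $i;

    # Keep every code ref alive so that addresses cannot be recycled.
    push @from_global,  sub { $global == $_[0] };
    push @from_lexical, sub { $lexical == $_[0] };
}

# Count how many distinct subs were actually produced in each case.
my ( %seen_global, %seen_lexical );
$seen_global{ refaddr($_) }  = 1 for @from_global;
$seen_lexical{ refaddr($_) } = 1 for @from_lexical;

printf "distinct subs using a global:  %d\n", scalar keys %seen_global;
printf "distinct subs using a lexical: %d\n", scalar keys %seen_lexical;
```

If the lexical case reports one sub per iteration and the global case reports a single shared sub, then "any_cr" would be paying for two closure creations per matched line on top of the assignment, which "any" avoids. Whether that accounts for the whole gap in the table above is an assumption that would still need benchmarking.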


In reply to Why is "any" slow in this case? by Anonymous Monk
