I have to read many text files that represent 2D data (matrices), where the "interesting" lines contain column and row indexes. Some of those lines should be skipped. The whole project actually works fine and is fast enough; I'm just idly puzzled by the benchmarks I ran later when I tried to "improve"/refactor it. The data and code below are reduced to nonsense for the sake of an SSCCE.
use strict;
use warnings;
use feature 'say';
use List::Util 'any';
use Benchmark 'cmpthese';

my $data = '';
for my $r ( 0 .. 31 ) {
    for my $c ( 0 .. 31 ) {
        $data .= "$c $r whatever\n"
    }
}

# say $data; die;

my @skip = ( 0, 15, 16, 31 );

cmpthese -1, {
    ugly => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            next if $1 == 0 or $1 == 15 or $1 == 16 or $1 == 31;
            next if $2 == 0 or $2 == 15 or $2 == 16 or $2 == 31;
            # something useful happens here,
            # after uninteresting entries have been skipped
        }
        return 1
    },
    ugly_cr => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            my ( $c, $r ) = ( $1, $2 );
            next if $c == 0 or $c == 15 or $c == 16 or $c == 31;
            next if $r == 0 or $r == 15 or $r == 16 or $r == 31;
        }
        return 1
    },
    any => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            next if any { $1 == $_ } @skip;
            next if any { $2 == $_ } @skip;
        }
        return 1
    },
    any_cr => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            my ( $c, $r ) = ( $1, $2 );
            next if any { $c == $_ } @skip;
            next if any { $r == $_ } @skip;
        }
        return 1
    }
};
Output:
          Rate  any_cr     any    ugly ugly_cr
any_cr   331/s      --    -54%    -64%    -74%
any      724/s    119%      --    -22%    -43%
ugly     930/s    181%     28%      --    -26%
ugly_cr 1265/s    282%     75%     36%      --
The initial, working code is similar to "ugly_cr". Then I thought I might postpone the assignment to lexicals until after the irrelevant lines had been filtered out. Would that be faster? No. I speculate that "ugly" is slower because $1 and friends are read-only, so they are numified anew on each of the 4 comparisons. Is this correct?
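One way to look at this guess in isolation is the sketch below. It matches once, then repeats the same 4 comparisons many times, either against $1 directly or against a lexical copy. The helper names count_capture and count_lexical, the sample line, and the repetition count are all made up for illustration. It can only show whether there is a difference, not why.

```perl
use strict;
use warnings;
use Benchmark 'cmpthese';

# Hypothetical helper: compare against $1 directly on every test.
sub count_capture {
    my ( $line, $reps ) = @_;
    $line =~ /^(\d+) (\d+)/ or die "no match";
    my $hits = 0;
    for ( 1 .. $reps ) {
        $hits++ if $1 == 0 or $1 == 15 or $1 == 16 or $1 == 31;
    }
    return $hits;
}

# Hypothetical helper: copy $1 to a lexical once, then compare the copy.
sub count_lexical {
    my ( $line, $reps ) = @_;
    $line =~ /^(\d+) (\d+)/ or die "no match";
    my $c    = $1;
    my $hits = 0;
    for ( 1 .. $reps ) {
        $hits++ if $c == 0 or $c == 15 or $c == 16 or $c == 31;
    }
    return $hits;
}

# "7" is not in the skip set, so all 4 comparisons run on each repetition.
cmpthese -1, {
    capture => sub { count_capture( "7 7 whatever", 100 ) },
    lexical => sub { count_lexical( "7 7 whatever", 100 ) },
};
```

Both variants pay for the same single regex match, so any gap between them should come from the comparisons themselves.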
Then I thought "any", being XS, might be fast, nicer to look at, and easy to extend with more rows/columns to skip later. It is a little slow for an array of just 4 elements to skip, but that result doesn't surprise me much. What completely puzzles me is that "any_cr" is slower still. Why? And why the asymmetry: copying to lexicals makes "ugly" faster (ugly_cr), yet makes "any" slower (any_cr)? I don't understand.
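A possible way to narrow down the "any" vs. "any_cr" asymmetry is the sketch below. It leaves the regex out entirely and compares an "any" block that refers to a lexical declared inside the loop with one that refers only to a package variable. This rests on an assumption, not something established above: that the block seeing a fresh lexical each iteration is what matters. The names skip_count_lexical, skip_count_package and $pkg_c are hypothetical.

```perl
use strict;
use warnings;
use List::Util 'any';
use Benchmark 'cmpthese';

my @skip = ( 0, 15, 16, 31 );
our $pkg_c;    # package variable, so the block below captures no lexical

# The block refers to a lexical declared inside the loop.
sub skip_count_lexical {
    my $n = 0;
    for my $v (@_) {
        my $c = $v;
        $n++ if any { $c == $_ } @skip;
    }
    return $n;
}

# The block refers only to a package variable.
sub skip_count_package {
    my $n = 0;
    for my $v (@_) {
        $pkg_c = $v;
        $n++ if any { $pkg_c == $_ } @skip;
    }
    return $n;
}

my @values = ( 0 .. 31 );
cmpthese -1, {
    lexical => sub { skip_count_lexical(@values) },
    package => sub { skip_count_package(@values) },
};
```

If the two rates come out clearly different, the cost is tied to how the block gets at its variable rather than to "any" itself or to the regex captures.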
In reply to Why is "any" slow in this case? by Anonymous Monk