in reply to Why is "any" slow in this case?

A hash seems the best option in the case described. It is also faster by several orders of magnitude (in your benchmark, however accurate it may be).
my %hash;
@hash{0, 15, 16, 31} = ();

# then add this to your benchmark
hash => sub {
    while ( $data =~ /^(\d+) (\d+)/mg ) {
        next if exists $hash{$1} or exists $hash{$2};
        return 1;
    }
},
             Rate  any_cr     any    ugly ugly_cr   hash2    hash
any_cr      865/s      --    -37%    -54%    -65%   -100%   -100%
any        1382/s     60%      --    -27%    -44%   -100%   -100%
ugly       1896/s    119%     37%      --    -24%   -100%   -100%
ugly_cr    2489/s    188%     80%     31%      --   -100%   -100%
hash    3084047/s 356493% 222992% 162532% 123813%     26%      --

Re^2: Why is "any" slow in this case?
by Anonymous Monk on Jul 28, 2025 at 06:39 UTC
    You have the 'return 1' in the wrong place...
      oops, my bad:
                Rate  any_cr     any    ugly ugly_cr    hash
      any_cr   869/s      --    -38%    -54%    -65%    -77%
      any     1395/s     61%      --    -26%    -43%    -63%
      ugly    1896/s    118%     36%      --    -23%    -49%
      ugly_cr 2465/s    184%     77%     30%      --    -34%
      hash    3725/s    329%    167%     96%     51%      --
        On a tangent, I have often thought Benchmark would have been better designed with an additional testing interface, allowing one to make sure that one is not comparing apples with oranges.
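        Benchmark has no such interface built in, but a check along these lines can be done by hand before timing. The following is only a minimal sketch: the two candidates (grep_count, loop_count) and the list @n are hypothetical stand-ins, not the subs from this thread. Each candidate is called once, and the run aborts if the return values differ.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical candidates: two ways to count the even numbers in a list.
my @n = ( 1 .. 100 );
my %candidates = (
    grep_count => sub { scalar grep { $_ % 2 == 0 } @n },
    loop_count => sub { my $c = 0; for (@n) { $c++ unless $_ % 2 } return $c },
);

# Call every candidate once and record what it returns.
my %result;
$result{$_} = $candidates{$_}->() for keys %candidates;

# Refuse to benchmark if the candidates disagree.
my ( $first, @rest ) = sort keys %result;
for my $name (@rest) {
    die "$name returned $result{$name}, but $first returned $result{$first}\n"
        unless $result{$name} eq $result{$first};
}

# Only now compare speed (at least 1 CPU second per candidate).
cmpthese( -1, \%candidates );
```

        A check like this would have caught the misplaced 'return 1' above, provided the candidates return something that actually reflects the work they did.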

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        see Wikisyntax for the Monastery

Re^2: Why is "any" slow in this case?
by Anonymous Monk on Jul 28, 2025 at 11:20 UTC

    Thanks for the tip, I'll use a hash look-up in the refactored version. Also, I was wrong about numification: the picture remains the same with string comparison (hello, AI). The unexpected outcome (for me) is: "never access $1, etc. more than twice per regexp executed, but assign the results to throwaway lexicals instead, even if the 'access' is masked/folded in loops". Interesting. The exception of any_cr remains an unresolved mystery. Thanks everyone (except "AI" with its rubbish, which was NOT interesting). Disappointed as usual about the latter.
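    For illustration, the "assign to throwaway lexicals" pattern might look like the sketch below. The sample $data and the names %skip, $x, $y and @kept are assumptions made up for this example; whether the copy actually pays off should be measured on the real data.

```perl
use strict;
use warnings;

my %skip;
@skip{ 0, 15, 16, 31 } = ();

my $data = "0 7\n3 4\n15 16\n";    # hypothetical sample input
my @kept;
while ( $data =~ /^(\d+) (\d+)/mg ) {
    my ( $x, $y ) = ( $1, $2 );    # read each capture variable exactly once
    next if exists $skip{$x} or exists $skip{$y};
    push @kept, "$x $y";           # later uses touch only the lexical copies
}
print "$_\n" for @kept;            # prints "3 4"
```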

      Three remarks

      • You didn't need to use $1 etc. at all: a regex match returns the captures in list context. my @matches = ( $str =~ /pa(tt)ern/g )
      • I suppose that, thanks to the trie optimization, putting the alternation of numbers directly inside a negative lookahead, (?!(0|15|16|31)\D)(\d+), would be a very fast alternative.°
      • The AI discussion happened in the context of another meditation; I only referenced it here for completeness.
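      A small sketch of the first remark, using a made-up $data string: with /g in list context, the match returns the captures of all matches as one flat list, which can then be walked in pairs. The names @numbers and @pairs are only illustrative.

```perl
use strict;
use warnings;

my $data = "0 7\n3 4\n15 16\n";    # hypothetical sample input

# List context plus /g: all captures from all matches, flattened.
my @numbers = ( $data =~ /^(\d+) (\d+)/mg );

# Walk the flat list two elements at a time.
my @pairs;
for ( my $i = 0 ; $i < @numbers ; $i += 2 ) {
    push @pairs, "$numbers[$i]-$numbers[$i+1]";
}
print "@pairs\n";                  # prints "0-7 3-4 15-16"
```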

      Happy testing!

      Cheers Rolf

      °) TIMTOWTDI

        hash => sub {
            while ( $data =~ /^(\d+) (\d+)/mg ) {
                next if exists $hash{$1} or exists $hash{$2};
            }
            return 1;
        },
        ahead => sub {
            while ( $data =~ /^(?!(?:0|15|16|31)\D)(\d+)\N{SPACE} (?!(?:0|15|16|31)\D)(\d+)/mgx ) { }
            return 1;
        },

                Rate  hash ahead
        hash  1742/s    --  -31%
        ahead 2541/s   46%    --

        Wow, thanks. I'll refactor with this, then. As to (1), simply generating a list of a few hundred captures is slower, even without working through it in pairs later. Sorry about (3), I didn't mean to reproach anyone.

      "Thanks everyone (except "AI" with its rubbish, which was NOT interesting). Disappointed as usual about the latter."

      👏