in reply to Why is "any" slow in this case?

As for ugly vs ugly_cr,

$1 is a magic variable. Every time you read from it, it gets repopulated (the matched substring is copied into it from the matched string) and subsequently numified. ugly_cr is faster because it cuts down the number of times that happens by four.
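
To illustrate the "pay on every read" cost, here is a minimal sketch that uses a tied scalar as a stand-in for $1. The CountingScalar class and the @skip values are made up for this example, and tie is not how $1 is implemented; it merely also runs a getter on each read, which makes the reads countable.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $1: a tied scalar whose FETCH counts reads.
package CountingScalar {
    sub TIESCALAR {
        my ( $class, $value ) = @_;
        return bless { value => $value, fetches => 0 }, $class;
    }
    sub FETCH {
        my ($self) = @_;
        $self->{fetches}++;
        return $self->{value};
    }
}

my @skip = ( 3, 5, 7, 9 );    # made-up values, four comparisons

# Read the "magic" variable in every comparison: one getter call per read.
my $direct_obj = tie my $direct, 'CountingScalar', 42;
for my $s (@skip) { my $hit = $direct == $s }
my $direct_fetches = $direct_obj->{fetches};

# Copy it into a plain lexical first: one getter call in total.
my $copied_obj = tie my $copied, 'CountingScalar', 42;
my $c = $copied;
for my $s (@skip) { my $hit = $c == $s }
my $copied_fetches = $copied_obj->{fetches};

print "direct reads: $direct_fetches getter calls\n";
print "copied once:  $copied_fetches getter call\n";
```

The second variant mirrors what the _cr versions do with my ( $c, $r ) = ( $1, $2 ).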

As for any vs any_cr,

The anon subs in any don't capture any variables, but the ones in any_cr capture two. Introducing capturing adds overhead that's more expensive than the magic on $1.
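
One observable difference between the two kinds of anon sub, as a small sketch (the variable names are made up for illustration): evaluating a non-capturing sub { ... } expression repeatedly yields the same code ref, while a capturing one yields a fresh closure each time. This doesn't prove where the time goes in any_cr; it only shows that capturing changes what happens each time the sub expression is evaluated.

```perl
use strict;
use warnings;
use Scalar::Util qw( refaddr );

my ( @plain, @closures );
for my $n ( 1 .. 2 ) {
    push @plain,    sub { 1 };     # captures nothing
    push @closures, sub { $n };    # captures $n
}

# Both subs in each array are still alive, so equal addresses mean
# "same sub", not "address reused after the first one was freed".
my $plain_shared    = refaddr( $plain[0] ) == refaddr( $plain[1] )       ? 1 : 0;
my $closures_shared = refaddr( $closures[0] ) == refaddr( $closures[1] ) ? 1 : 0;

print "non-capturing subs are the same code ref: $plain_shared\n";
print "capturing subs are the same code ref:     $closures_shared\n";
```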


Update: Added "and subsequently numified".

Update: Confirmed that the overhead from capturing is the culprit, and adjusted the text appropriately. I confirmed this by changing all four any { ... } to any { $data; ... }. With this change, any becomes slower than any_cr.

Re^2: Why is "any" slow in this case?
by Anonymous Monk on Jul 29, 2025 at 07:12 UTC
    The anon subs in any don't capture any variables, but the ones in any_cr capture two. Introducing capturing adds overhead

    "for_cr" being slower than "for" confirms what you say; there's symmetry between "for_cr vs. for" and "any_cr vs. any". However, "grep_cr" doesn't seem to suffer from this capturing. Is its subroutine very different? Furthermore, injecting "$data;" at the beginning of the blocks, as you did, makes them all slow, including "grep_cr". Capturing "$c" and "$r" is OK, but capturing "$data" is penalised. Something is still amiss.

    for => sub {
        W: while ( $data =~ /^(\d+) (\d+)/mg ) {
            for ( @skip ) { next W if ( sub { $1 eq $_ })-> ()}
            for ( @skip ) { next W if ( sub { $2 eq $_ })-> ()}
        }
        return 1;
    },
    for_cr => sub {
        W: while ( $data =~ /^(\d+) (\d+)/mg ) {
            my ( $c, $r ) = ( $1, $2 );
            for ( @skip ) { next W if ( sub { $c eq $_ })-> ()}
            for ( @skip ) { next W if ( sub { $r eq $_ })-> ()}
        }
        return 1;
    },
    grep => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            next if grep { $1 eq $_ } @skip;
            next if grep { $2 eq $_ } @skip;
        }
        return 1
    },
    grep_cr => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            my ( $c, $r ) = ( $1, $2 );
            next if grep { $c == $_ } @skip;
            next if grep { $r == $_ } @skip;
        }
        return 1
    },
    grep_data => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            next if grep { $data; $1 eq $_ } @skip;
            next if grep { $data; $2 eq $_ } @skip;
        }
        return 1
    },
    grep_cr_data => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            my ( $c, $r ) = ( $1, $2 );
            next if grep { $data; $c == $_ } @skip;
            next if grep { $data; $r == $_ } @skip;
        }
        return 1
    },
    any => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            next if any { $1 == $_ } @skip;
            next if any { $2 == $_ } @skip;
        }
        return 1
    },
    any_cr => sub {
        while ( $data =~ /^(\d+) (\d+)/mg ) {
            my ( $c, $r ) = ( $1, $2 );
            next if any { $c == $_ } @skip;
            next if any { $r == $_ } @skip;
        }
        return 1
    },

                   Rate for_cr  for grep_data any_cr grep_cr_data  any grep grep_cr
    for_cr       96.4/s     -- -66%      -69%   -70%         -71% -86% -88%    -90%
    for           285/s   196%   --       -8%   -13%         -14% -60% -64%    -71%
    grep_data     311/s   223%   9%        --    -5%          -6% -56% -61%    -68%
    any_cr        326/s   239%  14%        5%     --          -1% -54% -59%    -66%
    grep_cr_data  331/s   244%  16%        6%     1%           -- -54% -59%    -66%
    any           714/s   641% 150%      129%   119%         116%   -- -11%    -26%
    grep          799/s   729% 180%      157%   145%         141%  12%   --    -18%
    grep_cr       968/s   905% 239%      211%   197%         192%  36%  21%      --

      "grep_cr" doesn't seem to suffer from this capturing.

      Correct.

      List::Util::any BLOCK LIST is syntactic sugar for List::Util::any sub BLOCK, LIST because of its prototype.

      $ perl -Mv5.14 -MList::Util=any -e'say 0+any { $_ > 3 } 1..5'
      1
      $ perl -Mv5.14 -MList::Util=any -e'say 0+any sub { $_ > 3 }, 1..5'
      1

      A sub's access to the variables of the lexical scope in which it's defined is called capturing. (A sub that captures is called a closure.)

      That's not the case for CORE::grep and CORE::any's blocks. Their blocks are no more subroutines than while's.

      $ perl -MO=Concise,-exec -MList::Util=any -e'any { /x/ } @a'
      1  <0> enter v
      2  <;> nextstate(main 31 -e:1) v:{
      3  <0> pushmark s
      4  <$> anoncode[CV CODE] sRM
      5  <#> gv[*a] s
      6  <1> rv2av[t4] lKM/1
      7  <#> gv[*any] s
      8  <1> entersub[t5] vKS/TARG
      9  <@> leave[1 ref] vKP/REFC
      -e syntax OK

      $ perl -MO=Concise,-exec -e'grep { /x/ } @a'
      1  <0> enter v
      2  <;> nextstate(main 1 -e:1) v:{
      3  <0> pushmark s
      4  <#> gv[*a] s
      5  <1> rv2av[t2] lKM/1
      6  <@> grepstart K
      7  <|> grepwhile(other->8)[t3] vK
      8      <0> enter s
      9      <;> nextstate(main 2 -e:1) v:{
      a      </> match(/"x"/) s
      b      <@> leave sKP
                 goto 7
      c  <@> leave[1 ref] vKP/REFC
      -e syntax OK

      Note the anoncode (sub { }) in one, and the actual code of the block (match) in the other.

      grep => sub {
          while ( $data =~ /^(\d+) (\d+)/mg ) {
              next if grep { $1 eq $_ } @skip;
              next if grep { $2 eq $_ } @skip;
          }
          return 1
      },
      grep_1 => sub {
          while ( $data =~ /^(\d+) (\d+)/mg ) {
              next if grep { 1; $1 eq $_ } @skip;
              next if grep { 1; $2 eq $_ } @skip;
          }
          return 1
      },

              Rate grep_1   grep
      grep_1 316/s     --   -61%
      grep   811/s   156%     --

      Writing this as an "answer" to myself, because I really don't want to ping anyone, let alone request an explanation; these tests are becoming stupid in addition to idle. Apparently "grep" is able to optimise its braces (block) away, at least sometimes (though not in the case of e.g. grep { /x/ } @a, but that's a digression):

      perl -MO=Concise,-exec -e "grep { $1 eq $_ } @a"
      1  <0> enter v
      2  <;> nextstate(main 1 -e:1) v:{
      3  <0> pushmark s
      4  <#> gv[*a] s
      5  <1> rv2av[t4] lKM/1
      6  <@> grepstart K
      7  <|> grepwhile(other->8)[t5] vK
      8      <#> gvsv[*1] s
      9      <#> gvsv[*_] s
      a      <2> seq sK/2
                 goto 7
      b  <@> leave[1 ref] vKP/REFC

      The output is the same without a block, i.e. for grep $1 eq $_, @a. But:

      perl -MO=Concise,-exec -e "grep { 1; $1 eq $_ } @a"
      1  <0> enter v
      2  <;> nextstate(main 1 -e:1) v:{
      3  <0> pushmark s
      4  <#> gv[*a] s
      5  <1> rv2av[t4] lKM/1
      6  <@> grepstart K
      7  <|> grepwhile(other->8)[t5] vK
      8      <0> enter s
      9      <;> nextstate(main 2 -e:1) v
      a      <#> gvsv[*1] s
      b      <#> gvsv[*_] s
      c      <2> seq sK/2
      d      <@> leave sKP
                 goto 7
      e  <@> leave[1 ref] vKP/REFC

      And so grep's not-really-anon-sub-but-something-else, even though it doesn't capture outside vars here, is significantly slower than any's real-anon-sub when that doesn't capture vars either, and actually as slow as the latter when it does capture vars.

Re^2: Why is "any" slow in this case?
by LanX (Saint) on Jul 28, 2025 at 15:51 UTC
    > Every time you read from it, it gets repopulated (the matched substring is copied into it from the matched string).

    Do you happen to know why? I can't see any side-effects justifying this behaviour.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

      So the regex engine doesn't waste any time updating globals that may not even get used. Instead the cost is deferred until the global is accessed, and incurred every time it is accessed because that's how magic works.

      That's how magic variables work. Every time you read a variable with get magic, a getter function is called to populate it first. Every time you write to a variable with set magic, a setter function is called to process the new value afterwards.

      use v5.40;
      use Variable::Magic qw( cast wizard );

      my $wiz = wizard(
         get => sub {
            say sprintf 'getter called for %X', refaddr( $_[0] );
            ${ $_[0] } = int( rand( 100 ) );
         },
      );

      my $var;
      say sprintf '`$var` is %X', refaddr( \$var );
      cast $var, $wiz;

      for ( 1 .. 4 ) {
         say "Loop: $_";
         say "`\$var` has value $var";
      }

      `$var` is 5F5668128E20
      Loop: 1
      getter called for 5F5668128E20
      `$var` has value 22
      Loop: 2
      getter called for 5F5668128E20
      `$var` has value 62
      Loop: 3
      getter called for 5F5668128E20
      `$var` has value 44
      Loop: 4
      getter called for 5F5668128E20
      `$var` has value 70

      The alternative to magic would be to preemptively copy substrings of the matched string into $`, $&, $' and $n, $+{name} and $-{name}.

        Well, the question could be rephrased as:

        Why do we need $1 to be magic, when it's read-only?

        Threading?

        > would be to preemptively copy substrings of the matched string into

        If this is about avoiding overhead for optional variables, retrieving the value just once on demand would be sufficient.

        The OP is accessing the same $1 4 times in a row.

        Update

        The code examples were updated in the parent post while I was replying.

        Cheers Rolf