Re: Confused by RegEx count

Other monks have already explained what's going on. Let me point out the efficiency of the solutions:

Note that the transliteration is much faster than the other option. Even when the character is variable and we have to use string eval (whip! whip!), it's much faster.

Instead of using substitution with length, you can use global substitution only, as it returns the number of replacements in scalar context. But it's still slower than transliteration:

#! /usr/bin/perl
use warnings;
use strict;

use Benchmark qw{ cmpthese };

my $orig = 'Just another Perl hacker,' x 100;
my $str  = $orig;
my $char = 'r';
my $q    = quotemeta $char;

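# Count via tr///. tr does not interpolate variables, so the character
# has to be spliced in with a string eval.
sub transliteration {
    my $count = eval "\$str =~ tr/$q//";
}

# Delete everything except the character from a copy (/r), measure the rest.
sub length_subst {
    my $count = length( $str =~ s/[^$q]//rg );
}

# Replace the character with itself; s///g returns the number of replacements.
sub subst {
    my $count = $str =~ s/$q/$char/g;
}

# Sanity checks: all three counts must agree and $str must be unchanged.
transliteration() == length_subst() or die 'Different t-ls';
transliteration() == subst()        or die 'Different t-s';
$orig eq $str or die 'Changed';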

cmpthese(-3, {
   transliteration => \&transliteration,
   length_subst    => \&length_subst,
   subst           => \&subst,
});

__END__
                    Rate    length_subst           subst transliteration
length_subst      2833/s              --            -91%            -97%
subst            30244/s            968%              --            -70%
transliteration 102423/s           3515%            239%              --

Update: Introduced quotemeta to transliteration, too. It didn't change the results significantly.
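
The `transliteration` sub above runs the string eval on every call, so part of its measured time is compilation. A minimal sketch of compiling the tr/// once and reusing it follows; `make_tr_counter` is a hypothetical helper name, not something from the benchmark, and I haven't measured how much it changes the numbers.

```perl
use strict;
use warnings;

# Hypothetical helper: build a counting closure for one character.
# The string eval runs once here instead of on every count.
sub make_tr_counter {
    my ($char) = @_;
    my $q = quotemeta $char;
    # tr/// with an empty replacement list and no modifiers only counts;
    # it does not change the string it is applied to.
    my $counter = eval "sub { \$_[0] =~ tr/$q// }"
        or die "Could not compile counter: $@";
    return $counter;
}

my $count_r = make_tr_counter('r');
print $count_r->('Just another Perl hacker,' x 100), "\n";
```

The closure could be dropped into the `cmpthese` hash as one more entry to see how it compares on your machine.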

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Re^2: Confused by RegEx count by NERDVANA (Priest) on Feb 21, 2024 at 00:06 UTC
Or you could just match instead of substituting, which is a few percent faster. `sub match { my $count =()= $str =~ /$q/g }`
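
For anyone who hasn't met the `=()=` in that one-liner: assigning to an empty list puts the match in list context, and a list assignment evaluated in scalar context returns the number of elements on its right-hand side. A small standalone sketch, where `count_matches` is a name I made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: count occurrences of a literal character.
sub count_matches {
    my ($str, $char) = @_;
    my $q = quotemeta $char;
    # () = ... forces list context, so /g returns every match;
    # the outer scalar assignment then receives the number of matches.
    my $count = () = $str =~ /$q/g;
    return $count;
}

print count_matches('Just another Perl hacker,', 'r'), "\n";
```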
Re^3: Confused by RegEx count by choroba (Cardinal) on Feb 21, 2024 at 09:23 UTC
Interestingly, on my machine:

                   Rate length_subst match subst transliteration
length_subst     2864/s           --  -89%  -90%            -97%
match           25687/s         797%    --  -13%            -74%
subst           29356/s         925%   14%    --            -70%
transliteration 98682/s        3346%  284%  236%              --
Re^4: Confused by RegEx count by hippo (Archbishop) on Feb 21, 2024 at 09:55 UTC
That's rather counter-intuitive. Just ran them myself and for me, match beats subst:

                   Rate length_subst subst match transliteration
length_subst     2080/s           --  -88%  -90%            -97%
subst           17998/s         765%    --  -16%            -77%
match           21423/s         930%   19%    --            -72%
transliteration 76797/s        3592%  327%  258%              --

This is perl 5, version 34, subversion 0 (v5.34.0) built for x86_64-linux-thread-multi.

🦛
Re^5: Confused by RegEx count by Danny (Chaplain) on Feb 21, 2024 at 11:41 UTC
Re^5: Confused by RegEx count by choroba (Cardinal) on Feb 22, 2024 at 10:09 UTC
Re^4: Confused by RegEx count by NERDVANA (Priest) on Feb 22, 2024 at 06:17 UTC
I get results like hippo's with perl 5.36 on a recent AMD Ryzen:

                    Rate length_subst subst match transliteration
length_subst      8970/s           --  -89%  -91%            -97%
subst            78884/s         779%    --  -19%            -78%
match            97126/s         983%   23%    --            -73%
transliteration 355019/s        3858%  350%  266%              --

Though, depending on how many matches there are, does perl have to assemble a stack of N elements (copying each character into its own scalar) before assigning the list to the scalar to get the count? With the subst, the right optimizations could allow it to update that one character without changing the length of the string or copying anything, so it could be fast, and then it doesn't need to assemble a list of the matches.
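
One way to probe the list-building question would be a variant that matches in scalar context, so no list of matches is ever assembled. This is only a sketch: `count_scalar_loop` is a hypothetical name, and nobody in this thread has benchmarked it, so whether it beats `match` or `subst` is an open question.

```perl
use strict;
use warnings;

# Hypothetical helper: count matches one at a time in scalar context.
sub count_scalar_loop {
    my ($str, $char) = @_;
    my $q = quotemeta $char;
    my $count = 0;
    # In scalar context, /g finds one match per evaluation and resumes
    # from pos($str), so the loop ends when no further match exists.
    $count++ while $str =~ /$q/g;
    return $count;
}

print count_scalar_loop('Just another Perl hacker,', 'r'), "\n";
```

Adding it as another entry in the `cmpthese` call would show where it lands relative to the list-context match.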