Extracting /regex/mg in a while loop

NetWallah has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Extracting /regex/mg in a while loop by haukex (Archbishop) on Oct 09, 2023 at 22:25 UTC
Your Method #3 doesn't work because a `m//g` regex with capture groups in list context returns a list of the substrings matched by any capturing parentheses in the regular expression, that is, `($1, $2...)`, repeated for each match (see my node here). `my @output = $x=~/(^A\|^B)(.+)$/mg; print Dumper(\@output); # $VAR1 = ['A', ' data for A', 'B', ' data for b', 'A', ' data for A', + ...` [download] I would guess that in Method #2, the overhead of splitting the string into lines and then running a regex match on each of the lines (as opposed to a single regex iterating through a single string) might be what is causing the slowdown. But I would address your comment in Method #1 about using names as follows, and this is the variation I might suggest in this case: named capture groups (see also perlre). `while ( $x =~ m{^ (?<name>A\|B) (?<data>.+) $}xmg ) { print "$+{name} method1 $+{data}\n" }` [download] (There are of course plenty of other ways, such as `m/\G.../gc` parsing.)	[reply] [d/l] [select]
Re: Extracting /regex/mg in a while loop by NERDVANA (Priest) on Oct 10, 2023 at 08:10 UTC
There's always a trade-off between elegance and performance and convenience. But I don't think you should be surprised that #1 is faster. Doing a quick back-of-the-envelope estimation of them might look like: Apply regex to string from pos($x), scanning for /(^A\|^B)/ in performance-tuned C code Update $1 and $2 (which are cheap read-only aliases) and update pos($_) and return true from regex. While opcode evaluates regex return value Enter loop scope Run some opcodes Exit loop scope Perform getline on a file handle Fill a read-buffer with chunks of the scalar (which ends up copying the whole scalar) Update $_ with a copy read from the buffer (which probably ends up copying the whole scalar again) While loop special-case evaluates true when getline succeeds Enter loop scope Initialize two pad variables Run a regex, and execute two opcodes even if it wasn't a line you wanted Perform a double assignment (maybe involves more copying) Run some opcodes Exit loop scope Normally, the performance of something like this wouldn't matter much, unless you're processing really huge data, so I'd say write it whichever way is most convenient and re-usable. (like if you often process files instead of scalars already loaded in memory, then might as well leave it in a read loop) But, if you want to try some other things for performance (no promises, I'm lazy and just suggesting experiments) you could try: `print "$1 method $2\n" while $x =~ /^([AB])(.+)$/mg;` [download] or `# always match the entire string, running your code on matching lines /^(?: (?: ([AB]) (.+) (?{ print "$1 method $2\n"; }) \| .* ) \n? )*/x` [download]	[reply] [d/l] [select]
Re: Extracting /regex/mg in a while loop by kcott (Archbishop) on Oct 10, 2023 at 11:03 UTC
G'day NetWallah, Here's an actual benchmark, using everyone's code (posted to date) plus a couple of my own. I've aimed to keep as close to the originals as possible. Update: My apologies. I made a huge mistake with the code I originally posted. I've stricken that code and placed it in the spoiler below. Here's a largely rewritten version which works correctly. #!/usr/bin/env perl use v5.38; use autodie; use constant { SEP1 => "\n" . '=' x 40, SEP2 => '-' x 40, }; use Benchmark 'cmpthese'; use File::Spec (); my ($null_fh, $base_str); BEGIN { open $null_fh, '>', File::Spec::->devnull(); $base_str = <<~'EOT'; AdataA BdataB CdataC EOT } my @bench_names = qw{net1 net2 hau1 ner1 ner2 kco1 kco2}; my (%bench_for, %benches_to_cmp, $mult_str); say 'Test', SEP1; for my $bench (@bench_names) { no strict 'refs'; ($bench_for{$bench} = \&{$bench})->(\$base_str); use strict 'refs'; say SEP2; $benches_to_cmp{$bench} = sub { $bench_for{$bench}->(\$mult_str, $null_fh); }; } for my $mult (1, 10, 100, 1_000, 10_000) { $mult_str = $base_str x $mult; say 'Bench (Length: ', length($mult_str), ')', SEP1; cmpthese 0 => \%benches_to_cmp; say SEP2; } sub net1 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; while ($str =~ /(^A\|^B)(.+)$/mg){ $out_fh->say("$1 net1 $2"); } return; } sub net2 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; open my $fh, '<', \$str; while (<$fh>) { my ($name, $data) = m/(^A\|^B)(.+)$/ or next; $out_fh->say("$name net2 $data"); } return; } sub hau1 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; while ($str =~ m{^ (?<name>A\|B) (?<data>.+) $}xmg) { $out_fh->say("$+{name} hau1 $+{data}"); } return; } sub ner1 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; $out_fh->say("$1 ner1 $2") while $str =~ /^([AB])(.+)$/mg; return; } sub ner2 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; $str =~ /^([AB])(.++)$(?{ $out_fh->say("$1 ner2 $2") })./m; return; } sub kco1 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; state $re = qr{(?m:^([AB])(.+)$)}; $out_fh->say("$1 kco1 $2") while $str =~ /$re/g; return; } sub kco2 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; state $re = qr{(?m:^([AB])(.+)$)}; $out_fh->say("$1 kco2 $2") while $str =~ /$re/go; return; } [download] Here's the output from a sample run: Test ======================================== A net1 dataA B net1 dataB ---------------------------------------- A net2 dataA B net2 dataB ---------------------------------------- A hau1 dataA B hau1 dataB ---------------------------------------- A ner1 dataA B ner1 dataB ---------------------------------------- A ner2 dataA B ner2 dataB ---------------------------------------- A kco1 dataA B kco1 dataB ---------------------------------------- A kco2 dataA B kco2 dataB ---------------------------------------- Bench (Length: 21) ======================================== Rate net2 hau1 kco1 net1 ner2 ner1 kco2 net2 99524/s -- -41% -55% -59% -59% -63% -64% hau1 168726/s 70% -- -24% -31% -31% -38% -39% kco1 221638/s 123% 31% -- -9% -9% -18% -20% net1 242811/s 144% 44% 10% -- -1% -10% -12% ner2 244388/s 146% 45% 10% 1% -- -10% -12% ner1 270164/s 171% 60% 22% 11% 11% -- -3% kco2 277481/s 179% 64% 25% 14% 14% 3% -- ---------------------------------------- Bench (Length: 210) ======================================== Rate hau1 net2 kco1 net1 ner2 ner1 kco2 hau1 18152/s -- -14% -31% -35% -36% -40% -41% net2 21115/s 16% -- -20% -24% -26% -30% -31% kco1 26414/s 46% 25% -- -5% -7% -12% -14% net1 27930/s 54% 32% 6% -- -2% -7% -9% ner2 28424/s 57% 35% 8% 2% -- -5% -8% ner1 30063/s 66% 42% 14% 8% 6% -- -2% kco2 30792/s 70% 46% 17% 10% 8% 2% -- ---------------------------------------- Bench (Length: 2100) ======================================== Rate hau1 net2 kco1 net1 ner2 ner1 kco2 hau1 1826/s -- -22% -30% -35% -37% -41% -41% net2 2342/s 28% -- -10% -17% -19% -24% -25% kco1 2597/s 42% 11% -- -8% -10% -16% -17% net1 2808/s 54% 20% 8% -- -3% -9% -10% ner2 2882/s 58% 23% 11% 3% -- -7% -7% ner1 3088/s 69% 32% 19% 10% 7% -- -1% kco2 3113/s 70% 33% 20% 11% 8% 1% -- ---------------------------------------- Bench (Length: 21000) ======================================== Rate hau1 net2 kco1 net1 ner2 ner1 kco2 hau1 185/s -- -23% -30% -33% -36% -39% -40% net2 241/s 30% -- -9% -13% -17% -21% -22% kco1 265/s 43% 10% -- -4% -9% -13% -14% net1 277/s 50% 15% 5% -- -4% -9% -10% ner2 290/s 57% 20% 9% 5% -- -5% -6% ner1 305/s 65% 26% 15% 10% 5% -- -1% kco2 308/s 66% 28% 16% 11% 6% 1% -- ---------------------------------------- Bench (Length: 210000) ======================================== Rate hau1 net2 kco1 net1 ner2 kco2 ner1 hau1 18.3/s -- -26% -31% -35% -38% -41% -41% net2 24.8/s 36% -- -7% -12% -16% -20% -21% kco1 26.6/s 45% 7% -- -6% -10% -15% -15% net1 28.2/s 54% 13% 6% -- -5% -9% -10% ner2 29.5/s 61% 19% 11% 5% -- -5% -5% kco2 31.1/s 70% 25% 17% 10% 5% -- -0% ner1 31.2/s 71% 26% 18% 11% 6% 0% -- ---------------------------------------- [download] #!/usr/bin/env perl use v5.38; use autodie; use constant { SEP1 => "\n" . '=' x 40, SEP2 => '-' x 40, }; use Benchmark 'cmpthese'; my ($null_fh, $base_str); BEGIN { open $null_fh, '>', '/dev/null'; $base_str = <<~'EOT'; AdataA BdataB CdataC EOT } say 'Test', SEP1; net1(); say SEP2; net2(); say SEP2; hau1(); say SEP2; ner1(); say SEP2; ner2(); say SEP2; kco1(); say SEP2; kco2(); say SEP2; my $bench_mult = 1_000; say 'Bench', SEP1; cmpthese 0 => { net1 => sub { net1($bench_mult, $null_fh); }, net2 => sub { net2($bench_mult, $null_fh); }, hau1 => sub { hau1($bench_mult, $null_fh); }, ner1 => sub { ner1($bench_mult, $null_fh); }, ner2 => sub { ner2($bench_mult, $null_fh); }, kco1 => sub { kco1($bench_mult, $null_fh); }, kco2 => sub { kco2($bench_mult, $null_fh); }, }; say SEP2; sub net1 ($mult = 1, $out_fh = STDOUT) { state $str = $base_str x $mult; while ($str =~ /(^A\|^B)(.+)$/mg){ $out_fh->say("$1 net1 $2"); } return; } sub net2 ($mult = 1, $out_fh = STDOUT) { state $str = $base_str x $mult; open my $fh, '<', \$str; while (<$fh>) { my ($name, $data) = m/(^A\|^B)(.+)$/ or next; $out_fh->say("$name net2 $data"); } return; } sub hau1 ($mult = 1, $out_fh = STDOUT) { state $str = $base_str x $mult; while ($str =~ m{^ (?<name>A\|B) (?<data>.+) $}xmg) { $out_fh->say("$+{name} hau1 $+{data}"); } return; } sub ner1 ($mult = 1, $out_fh = STDOUT) { state $str = $base_str x $mult; $out_fh->say("$1 ner1 $2") while $str =~ /^([AB])(.+)$/mg; return; } sub ner2 ($mult = 1, $out_fh = STDOUT) { state $str = $base_str x $mult; $str =~ /^([AB])(.++)$(?{ $out_fh->say("$1 ner2 $2") })./m; return; } sub kco1 ($mult = 1, $out_fh = STDOUT) { state $str = $base_str x $mult; state $re = qr{(?m:^([AB])(.+)$)}; $out_fh->say("$1 kco1 $2") while $str =~ /$re/g; return; } sub kco2 ($mult = 1, $out_fh = STDOUT) { state $str = $base_str x $mult; state $re = qr{(?m:^([AB])(.+)$)}; $out_fh->say("$1 kco2 $2") while $str =~ /$re/go; return; } [download] Output: Test ======================================== A net1 dataA B net1 dataB ---------------------------------------- A net2 dataA B net2 dataB ---------------------------------------- A hau1 dataA B hau1 dataB ---------------------------------------- A ner1 dataA B ner1 dataB ---------------------------------------- A ner2 dataA B ner2 dataB ---------------------------------------- A kco1 dataA B kco1 dataB ---------------------------------------- A kco2 dataA B kco2 dataB ---------------------------------------- Bench ======================================== Rate net2 hau1 kco1 ner2 net1 kco2 ner1 net2 97106/s -- -46% -58% -62% -62% -66% -67% hau1 179094/s 84% -- -22% -30% -31% -38% -39% kco1 230504/s 137% 29% -- -11% -11% -20% -22% ner2 257642/s 165% 44% 12% -- -0% -10% -13% net1 258083/s 166% 44% 12% 0% -- -10% -13% kco2 287414/s 196% 60% 25% 12% 11% -- -3% ner1 295538/s 204% 65% 28% 15% 15% 3% -- ---------------------------------------- [download] I changed `$bench_mult = 1_000` to `$bench_mult = 1_000_000` — it made little difference to the results. Test ======================================== ... no change here ... ---------------------------------------- Bench ======================================== Rate net2 hau1 kco1 net1 ner2 kco2 ner1 net2 103431/s -- -43% -56% -61% -61% -65% -66% hau1 180779/s 75% -- -24% -32% -33% -39% -40% kco1 237649/s 130% 31% -- -10% -11% -20% -21% net1 264658/s 156% 46% 11% -- -1% -11% -12% ner2 267892/s 159% 48% 13% 1% -- -10% -11% kco2 297007/s 187% 64% 25% 12% 11% -- -2% ner1 302190/s 192% 67% 27% 14% 13% 2% -- ---------------------------------------- [download] The '`use v5.38;`' represents the version I'm using. You can change that to `v5.36` without needing to change any other parts of the code. And, of course, you can wind it back to earlier versions: the further back you go, the more changes that will be needed. CAVEAT for MSWin users: I expect the `/dev/null` on line 14 will need changing; however, I don't know what would be appropriate. Update: See my updated reply to haj — that CAVEAT is no longer valid for the updated code. — Ken	[reply] [d/l] [select]
Re^2: Extracting /regex/mg in a while loop by haj (Vicar) on Oct 10, 2023 at 11:54 UTC
Nice work! About that CAVEAT for Windows: File::Spec covers that in a portable way: `$devnull = File::Spec->devnull();`	[reply]
Re^3: Extracting /regex/mg in a while loop by kcott (Archbishop) on Oct 10, 2023 at 16:20 UTC
G'day haj, ++ Thanks for the tip. For anyone looking for portability, DRYness, and an easy way to add more tests, replace all the code prior to the start of the subroutine definitions with: Update: I made an error with the original code I posted. I've stricken that code and replaced it with a rewrite which works correctly. That rewrite includes all of the additional features: "portability, DRYness, and an easy way to add more tests". The code I posted below is no longer valid; I've stricken it and placed it in a spoiler. The final paragraph, about adding new tests, is still valid: I left it unchanged. #!/usr/bin/env perl use v5.38; use autodie; use constant { SEP1 => "\n" . '=' x 40, SEP2 => '-' x 40, BENCH_MULT => 1_000, }; use Benchmark 'cmpthese'; use File::Spec (); my ($null_fh, $base_str); BEGIN { open $null_fh, '>', File::Spec::->devnull(); $base_str = <<~'EOT'; AdataA BdataB CdataC EOT } my @bench_names = qw{net1 net2 hau1 ner1 ner2 kco1 kco2}; my (%bench_for, %benches_to_cmp); say 'Test', SEP1; for my $bench (@bench_names) { no strict 'refs'; ($bench_for{$bench} = \&{$bench})->(); use strict 'refs'; say SEP2; $benches_to_cmp{$bench} = sub { $bench_for{$bench}->(BENCH_MULT, $null_fh) }; } say 'Bench', SEP1; cmpthese 0 => \%benches_to_cmp; say SEP2; [download] I've run this a few times: the results are comparable with what I posted earlier. The multiplier for `$base_str` is now the constant `BENCH_MULT`; play around with this if you're so inclined. For more tests, add your subroutine and its name to `@bench_names`; the rest is done for you. — Ken	[reply] [d/l] [select]
Re^2: Extracting /regex/mg in a while loop by Marshall (Canon) on Oct 10, 2023 at 19:18 UTC
I expect the /dev/null on line 14 will need changing The "bit bucket" on Windows is the psuedo file called NUL. This is a reserved file name and you cannot create a file named that. On command line: `type someFile > NUL` reads someFile and sends it to nowhere. You can open a filehandle to NUL and write to it.	[reply] [d/l]
Re^3: Extracting /regex/mg in a while loop by kcott (Archbishop) on Oct 10, 2023 at 23:04 UTC
G'day Marshall, ++haj already pointed me to a portable solution, using devnull() from File::Spec, which I incorporated into my code. `... use File::Spec (); ... open $null_fh, '>', File::Spec::->devnull(); ...` [download] While writing my original code, I had used `state` variables so that the "`# Increase multiplier for benchmarking`" (from NetWallah's OP) would only be called once for each of the multitude of calls by `Benchmark::cmpthese()`. Unfortunately, subsequently adding code (prior to posting) to test each of the subroutines to be benchmarked ("`say 'Test', ...`") caused major problems. I didn't spot that until after I posted: I pretty much rewrote the code to fix these problems and added Updates to explain what I'd done. In order to avoid invalidating what haj had written (as per "How do I change/delete my post?") I left "CAVEAT for MSWin users: ..." but immediately followed it with "Update: ... CAVEAT is no longer valid ..." I appreciate this ended up all a bit messy. My apologies if it caused confusion. — Ken	[reply] [d/l] [select]
Re^2: Extracting /regex/mg in a while loop by NetWallah (Canon) on Oct 11, 2023 at 05:54 UTC
Thanks (++) for putting in the effort to properly encapsulate this in benchmark code. I updated the code somewhat to allow it to run on my slightly older (5.34) Perl. I also added an additional method (kco3) as my preferred way to write your version of the code. Read more... (3 kB) My results are different than yours: $ perl pm-11154897.pl Test ======================================== ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- Bench (Length: 21) ======================================== Rate net2 kco1 kco2 kco3 net1 ner1 hau1 ner2 net2 260654/s -- -90% -92% -92% -97% -97% -97% -97% kco1 2536874/s 873% -- -20% -21% -68% -69% -70% -71% kco2 3178301/s 1119% 25% -- -1% -60% -61% -62% -64% kco3 3207697/s 1131% 26% 1% -- -59% -61% -62% -63% net1 7889843/s 2927% 211% 148% 146% -- -4% -5% -10% ner1 8226198/s 3056% 224% 159% 156% 4% -- -1% -6% hau1 8333730/s 3097% 229% 162% 160% 6% 1% -- -5% ner2 8767731/s 3264% 246% 176% 173% 11% 7% 5% -- ---------------------------------------- Bench (Length: 210) ======================================== Rate net2 kco1 kco3 kco2 net1 hau1 ner1 ner2 net2 260526/s -- -90% -92% -92% -97% -97% -97% -97% kco1 2492787/s 857% -- -20% -20% -68% -69% -69% -71% kco3 3114015/s 1095% 25% -- -0% -60% -61% -61% -64% kco2 3123870/s 1099% 25% 0% -- -60% -61% -61% -64% net1 7758408/s 2878% 211% 149% 148% -- -3% -3% -10% hau1 7965224/s 2957% 220% 156% 155% 3% -- -0% -7% ner1 7965224/s 2957% 220% 156% 155% 3% 0% -- -7% ner2 8601600/s 3202% 245% 176% 175% 11% 8% 8% -- ---------------------------------------- Bench (Length: 2100) ======================================== Rate net2 kco1 kco2 kco3 hau1 net1 ner1 ner2 net2 257948/s -- -90% -92% -92% -97% -97% -97% -97% kco1 2501079/s 870% -- -19% -20% -67% -68% -68% -71% kco2 3097753/s 1101% 24% -- -1% -59% -60% -60% -64% kco3 3133787/s 1115% 25% 1% -- -58% -60% -60% -64% hau1 7510752/s 2812% 200% 142% 140% -- -3% -4% -13% net1 7771227/s 2913% 211% 151% 148% 3% -- -0% -10% ner1 7794563/s 2922% 212% 152% 149% 4% 0% -- -10% ner2 8628990/s 3245% 245% 179% 175% 15% 11% 11% -- ---------------------------------------- Bench (Length: 21000) ======================================== Rate net2 kco1 kco2 kco3 net1 ner1 hau1 ner2 net2 231838/s -- -91% -93% -93% -97% -97% -97% -97% kco1 2478677/s 969% -- -20% -22% -67% -68% -69% -70% kco2 3094492/s 1235% 25% -- -3% -59% -61% -62% -63% kco3 3178304/s 1271% 28% 3% -- -58% -59% -61% -62% net1 7593413/s 3175% 206% 145% 139% -- -3% -6% -9% ner1 7840377/s 3282% 216% 153% 147% 3% -- -3% -6% hau1 8068001/s 3380% 225% 161% 154% 6% 3% -- -4% ner2 8382339/s 3516% 238% 171% 164% 10% 7% 4% -- ---------------------------------------- Bench (Length: 210000) ======================================== Rate net2 kco1 kco2 kco3 hau1 ner1 net1 ner2 net2 250558/s -- -85% -89% -92% -96% -97% -97% -97% kco1 1640135/s 555% -- -28% -46% -75% -78% -78% -81% kco2 2269326/s 806% 38% -- -25% -66% -70% -70% -73% kco3 3032215/s 1110% 85% 34% -- -54% -59% -60% -65% hau1 6657718/s 2557% 306% 193% 120% -- -11% -13% -22% ner1 7484171/s 2887% 356% 230% 147% 12% -- -2% -13% net1 7612303/s 2938% 364% 235% 151% 14% 2% -- -11% ner2 8558296/s 3316% 422% 277% 182% 29% 14% 12% -- ---------------------------------------- [download] "These opinions are my own, though for a small fee they be yours too."	[reply] [d/l] [select]
Re^3: Extracting /regex/mg in a while loop by kcott (Archbishop) on Oct 11, 2023 at 09:36 UTC
"My results are different than yours:" Well, that's hardly surprising. Surely the fact that none of the tests output anything raised a red flag. I put the code you posted in `op_bench1.pl` and ran it: `$ ./op_bench1.pl Missing right curly or square bracket at ./op_bench1.pl line 121, at e +nd of line syntax error at ./op_bench1.pl line 121, at EOF Execution of ./op_bench1.pl aborted due to compilation errors.` [download] I copied `op_bench1.pl` to `op_bench1_mod1.pl`; added `return; }` [download] to the end of the file and ran it: $ ./op_bench1_mod1.pl Test ======================================== ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- Bench (Length: 21) ======================================== Rate net2 kco1 kco3 kco2 net1 hau1 ner1 ner2 net2 98212/s -- -92% -94% -94% -98% -98% -98% -98% kco1 1241107/s 1164% -- -28% -29% -78% -78% -79% -80% kco3 1727274/s 1659% 39% -- -1% -69% -70% -70% -72% kco2 1751536/s 1683% 41% 1% -- -69% -69% -70% -71% net1 5641441/s 5644% 355% 227% 222% -- -0% -3% -7% hau1 5668650/s 5672% 357% 228% 224% 0% -- -3% -7% ner1 5845326/s 5852% 371% 238% 234% 4% 3% -- -4% ner2 6072829/s 6083% 389% 252% 247% 8% 7% 4% -- ---------------------------------------- Bench (Length: 210) ======================================== [download] I hit Ctrl-C at that point. I copied `op_bench1_mod1.pl` to `op_bench1_mod2.pl`. I added `use strict; use warnings;` [download] after `#use v5.38;` [download] and ran it: $ ./op_bench1_mod2.pl Illegal character in prototype for main::net1 : $rstr, $out_fh = STDO +UT at ./op_bench1_mod2.pl line 48. Illegal character after '_' in prototype for main::net1 : $rstr, $out_ +fh = STDOUT at ./op_bench1_mod2.pl line 48. Illegal character in prototype for main::net2 : $rstr, $out_fh = STDO +UT at ./op_bench1_mod2.pl line 58. Illegal character after '_' in prototype for main::net2 : $rstr, $out_ +fh = STDOUT at ./op_bench1_mod2.pl line 58. Illegal character in prototype for main::hau1 : $rstr, $out_fh = STDO +UT at ./op_bench1_mod2.pl line 71. Illegal character after '_' in prototype for main::hau1 : $rstr, $out_ +fh = STDOUT at ./op_bench1_mod2.pl line 71. Illegal character in prototype for main::ner1 : $rstr, $out_fh = STDO +UT at ./op_bench1_mod2.pl line 81. Illegal character after '_' in prototype for main::ner1 : $rstr, $out_ +fh = STDOUT at ./op_bench1_mod2.pl line 81. Illegal character in prototype for main::ner2 : $rstr, $out_fh = STDO +UT at ./op_bench1_mod2.pl line 89. Illegal character after '_' in prototype for main::ner2 : $rstr, $out_ +fh = STDOUT at ./op_bench1_mod2.pl line 89. Global symbol "$rstr" requires explicit package name (did you forget t +o declare "my $rstr"?) at ./op_bench1_mod2.pl line 49. Global symbol "$out_fh" requires explicit package name (did you forget + to declare "my $out_fh"?) at ./op_bench1_mod2.pl line 52. Global symbol "$rstr" requires explicit package name (did you forget t +o declare "my $rstr"?) at ./op_bench1_mod2.pl line 59. Global symbol "$out_fh" requires explicit package name (did you forget + to declare "my $out_fh"?) at ./op_bench1_mod2.pl line 65. Global symbol "$rstr" requires explicit package name (did you forget t +o declare "my $rstr"?) at ./op_bench1_mod2.pl line 72. Global symbol "$out_fh" requires explicit package name (did you forget + to declare "my $out_fh"?) at ./op_bench1_mod2.pl line 75. Global symbol "$rstr" requires explicit package name (did you forget t +o declare "my $rstr"?) at ./op_bench1_mod2.pl line 82. Global symbol "$out_fh" requires explicit package name (did you forget + to declare "my $out_fh"?) at ./op_bench1_mod2.pl line 84. Global symbol "$rstr" requires explicit package name (did you forget t +o declare "my $rstr"?) at ./op_bench1_mod2.pl line 90. Global symbol "$out_fh" requires explicit package name (did you forget + to declare "my $out_fh"?) at ./op_bench1_mod2.pl line 92. ./op_bench1_mod2.pl has too many errors. [download] I copied `op_bench1_mod2.pl` to `op_bench1_mod3.pl`. Then changed `#use v5.38; use strict; use warnings; use feature qw\|say\|;` [download] to `use v5.34; use warnings; use experimental 'signatures';` [download] The names of the subroutines identified the author (first three characters of username in lower-case) and the order of suggested solutions (a single digit). I changed `my @bench_names = qw{net1 net2 hau1 ner1 ner2 kco1 kco2 kco3};` [download] to `my @bench_names = qw{net1 net2 hau1 ner1 ner2 kco1 kco2 net3};` [download] You'd placed `kco3()` between `kco1()` and `kco2()`: I initially had difficulty locating it as I expected it to be added to the end. You also had it erroneously reporting that it was `kco2`: `sub kco3 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; my $re = qr{(?m:^(A\|B)(.+)$)}; while ($str =~ /$re/go){ $out_fh->say("$1 kco2 $2") ; } return; }` [download] I moved it to the end and fixed those problems. The last code in the script is now: `sub net3 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; my $re = qr{(?m:^(A\|B)(.+)$)}; while ($str =~ /$re/go){ $out_fh->say("$1 net3 $2"); } return; }` [download] I then ran this script: $ ./op_bench1_mod3.pl Test ======================================== A net1 dataA B net1 dataB ---------------------------------------- A net2 dataA B net2 dataB ---------------------------------------- A hau1 dataA B hau1 dataB ---------------------------------------- A ner1 dataA B ner1 dataB ---------------------------------------- A ner2 dataA B ner2 dataB ---------------------------------------- A kco1 dataA B kco1 dataB ---------------------------------------- A kco2 dataA B kco2 dataB ---------------------------------------- A net3 dataA B net3 dataB ---------------------------------------- Bench (Length: 21) ======================================== Rate net2 hau1 kco1 net3 kco2 net1 ner2 ner1 net2 61894/s -- -64% -70% -75% -75% -75% -75% -78% hau1 170557/s 176% -- -17% -30% -31% -31% -32% -39% kco1 205902/s 233% 21% -- -15% -16% -17% -18% -27% net3 243209/s 293% 43% 18% -- -1% -2% -3% -14% kco2 246357/s 298% 44% 20% 1% -- -0% -2% -12% net1 247587/s 300% 45% 20% 2% 0% -- -1% -12% ner2 250431/s 305% 47% 22% 3% 2% 1% -- -11% ner1 281208/s 354% 65% 37% 16% 14% 14% 12% -- ---------------------------------------- Bench (Length: 210) ======================================== Rate hau1 net2 kco1 net1 ner2 net3 ner1 kco2 hau1 18776/s -- -1% -29% -33% -35% -39% -39% -39% net2 18871/s 1% -- -28% -33% -35% -38% -39% -39% kco1 26290/s 40% 39% -- -6% -9% -14% -15% -15% net1 28066/s 49% 49% 7% -- -3% -8% -9% -9% ner2 28885/s 54% 53% 10% 3% -- -6% -7% -7% net3 30649/s 63% 62% 17% 9% 6% -- -1% -1% ner1 30936/s 65% 64% 18% 10% 7% 1% -- -0% kco2 30946/s 65% 64% 18% 10% 7% 1% 0% -- ---------------------------------------- Bench (Length: 2100) ======================================== Rate hau1 net2 kco1 net1 ner2 ner1 net3 kco2 hau1 1900/s -- -21% -30% -33% -36% -39% -40% -41% net2 2397/s 26% -- -11% -16% -19% -23% -24% -25% kco1 2703/s 42% 13% -- -5% -9% -13% -14% -15% net1 2837/s 49% 18% 5% -- -5% -9% -10% -11% ner2 2973/s 56% 24% 10% 5% -- -5% -5% -7% ner1 3124/s 64% 30% 16% 10% 5% -- -1% -2% net3 3143/s 65% 31% 16% 11% 6% 1% -- -2% kco2 3199/s 68% 33% 18% 13% 8% 2% 2% -- ---------------------------------------- Bench (Length: 21000) ======================================== Rate hau1 net2 kco1 net1 ner2 net3 ner1 kco2 hau1 190/s -- -23% -29% -33% -37% -39% -40% -40% net2 247/s 30% -- -8% -13% -18% -21% -21% -22% kco1 269/s 42% 9% -- -5% -11% -14% -14% -15% net1 285/s 50% 15% 6% -- -6% -9% -9% -10% ner2 302/s 59% 23% 12% 6% -- -4% -4% -5% net3 313/s 65% 27% 16% 10% 4% -- -0% -1% ner1 314/s 65% 27% 17% 10% 4% 0% -- -1% kco2 317/s 67% 29% 18% 12% 5% 1% 1% -- ---------------------------------------- Bench (Length: 210000) ======================================== Rate hau1 net2 kco1 net1 ner2 net3 ner1 kco2 hau1 19.1/s -- -24% -30% -33% -37% -39% -39% -40% net2 25.1/s 31% -- -8% -12% -17% -20% -20% -22% kco1 27.2/s 42% 8% -- -4% -10% -13% -13% -15% net1 28.5/s 49% 13% 5% -- -5% -9% -9% -11% ner2 30.1/s 58% 20% 11% 6% -- -4% -4% -6% net3 31.4/s 64% 25% 15% 10% 4% -- -0% -2% ner1 31.4/s 64% 25% 15% 10% 4% 0% -- -2% kco2 32.0/s 68% 28% 18% 12% 6% 2% 2% -- ---------------------------------------- [download] Now the results more closely resemble mine. Here's `op_bench1_mod3.pl` in full: #!/usr/bin/env perl use v5.34; use warnings; use experimental 'signatures'; use autodie; use constant { SEP1 => "\n" . '=' x 40, SEP2 => '-' x 40, }; use Benchmark 'cmpthese'; use File::Spec (); my ($null_fh, $base_str); BEGIN { open $null_fh, '>', File::Spec::->devnull(); $base_str = <<~'EOT'; AdataA BdataB CdataC EOT } my @bench_names = qw{net1 net2 hau1 ner1 ner2 kco1 kco2 net3}; my (%bench_for, %benches_to_cmp, $mult_str); say 'Test', SEP1; for my $bench (@bench_names) { no strict 'refs'; ($bench_for{$bench} = \&{$bench})->(\$base_str); use strict 'refs'; say SEP2; $benches_to_cmp{$bench} = sub { $bench_for{$bench}->(\$mult_str, $null_fh); }; } for my $mult (1, 10, 100, 1_000, 10_000) { $mult_str = $base_str x $mult; say 'Bench (Length: ', length($mult_str), ')', SEP1; cmpthese 0 => \%benches_to_cmp; say SEP2; } sub net1 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; while ($str =~ /(^A\|^B)(.+)$/mg) { $out_fh->say("$1 net1 $2"); } return; } sub net2 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; open my $fh, '<', \$str; while (<$fh>) { my ($name, $data) = m/(^A\|^B)(.+)$/ or next; $out_fh->say("$name net2 $data"); } return; } sub hau1 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; while ($str =~ m{^ (?<name>A\|B) (?<data>.+) $}xmg) { $out_fh->say("$+{name} hau1 $+{data}"); } return; } sub ner1 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; $out_fh->say("$1 ner1 $2") while $str =~ /^([AB])(.+)$/mg; return; } sub ner2 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; $str =~ /^([AB])(.++)$(?{ $out_fh->say("$1 ner2 $2") })./m; return; } sub kco1 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; my $re = qr{(?m:^([AB])(.+)$)}; $out_fh->say("$1 kco1 $2") while $str =~ /$re/g; return; } sub kco2 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; my $re = qr{(?m:^([AB])(.+)$)}; $out_fh->say("$1 kco2 $2") while $str =~ /$re/go; return; } sub net3 ($rstr, $out_fh = STDOUT) { my $str = $$rstr; my $re = qr{(?m:^(A\|B)(.+)$)}; while ($str =~ /$re/go){ $out_fh->say("$1 net3 $2"); } return; } [download] — Ken	[reply] [d/l] [select]