in reply to Extracting /regex/mg in a while loop
G'day NetWallah,
Here's an actual benchmark, using everyone's code (posted to date) plus a couple of my own. I've aimed to keep as close to the originals as possible.
Update: My apologies. I made a huge mistake with the code I originally posted. I've stricken that code and placed it in the spoiler below. Here's a largely rewritten version which works correctly.
#!/usr/bin/env perl use v5.38; use autodie; use constant { SEP1 => "\n" . '=' x 40, SEP2 => '-' x 40, }; use Benchmark 'cmpthese'; use File::Spec (); my ($null_fh, $base_str); BEGIN { open $null_fh, '>', File::Spec::->devnull(); $base_str = <<~'EOT'; AdataA BdataB CdataC EOT } my @bench_names = qw{net1 net2 hau1 ner1 ner2 kco1 kco2}; my (%bench_for, %benches_to_cmp, $mult_str); say 'Test', SEP1; for my $bench (@bench_names) { no strict 'refs'; ($bench_for{$bench} = \&{$bench})->(\$base_str); use strict 'refs'; say SEP2; $benches_to_cmp{$bench} = sub { $bench_for{$bench}->(\$mult_str, $null_fh); }; } for my $mult (1, 10, 100, 1_000, 10_000) { $mult_str = $base_str x $mult; say 'Bench (Length: ', length($mult_str), ')', SEP1; cmpthese 0 => \%benches_to_cmp; say SEP2; } sub net1 ($rstr, $out_fh = *STDOUT) { my $str = $$rstr; while ($str =~ /(^A|^B)(.+)$/mg){ $out_fh->say("$1 net1 $2"); } return; } sub net2 ($rstr, $out_fh = *STDOUT) { my $str = $$rstr; open my $fh, '<', \$str; while (<$fh>) { my ($name, $data) = m/(^A|^B)(.+)$/ or next; $out_fh->say("$name net2 $data"); } return; } sub hau1 ($rstr, $out_fh = *STDOUT) { my $str = $$rstr; while ($str =~ m{^ (?<name>A|B) (?<data>.+) $}xmg) { $out_fh->say("$+{name} hau1 $+{data}"); } return; } sub ner1 ($rstr, $out_fh = *STDOUT) { my $str = $$rstr; $out_fh->say("$1 ner1 $2") while $str =~ /^([AB])(.+)$/mg; return; } sub ner2 ($rstr, $out_fh = *STDOUT) { my $str = $$rstr; $str =~ /^([AB])(.++)$(?{ $out_fh->say("$1 ner2 $2") })./m; return; } sub kco1 ($rstr, $out_fh = *STDOUT) { my $str = $$rstr; state $re = qr{(?m:^([AB])(.+)$)}; $out_fh->say("$1 kco1 $2") while $str =~ /$re/g; return; } sub kco2 ($rstr, $out_fh = *STDOUT) { my $str = $$rstr; state $re = qr{(?m:^([AB])(.+)$)}; $out_fh->say("$1 kco2 $2") while $str =~ /$re/go; return; }
Here's the output from a sample run:
Test ======================================== A net1 dataA B net1 dataB ---------------------------------------- A net2 dataA B net2 dataB ---------------------------------------- A hau1 dataA B hau1 dataB ---------------------------------------- A ner1 dataA B ner1 dataB ---------------------------------------- A ner2 dataA B ner2 dataB ---------------------------------------- A kco1 dataA B kco1 dataB ---------------------------------------- A kco2 dataA B kco2 dataB ---------------------------------------- Bench (Length: 21) ======================================== Rate net2 hau1 kco1 net1 ner2 ner1 kco2 net2 99524/s -- -41% -55% -59% -59% -63% -64% hau1 168726/s 70% -- -24% -31% -31% -38% -39% kco1 221638/s 123% 31% -- -9% -9% -18% -20% net1 242811/s 144% 44% 10% -- -1% -10% -12% ner2 244388/s 146% 45% 10% 1% -- -10% -12% ner1 270164/s 171% 60% 22% 11% 11% -- -3% kco2 277481/s 179% 64% 25% 14% 14% 3% -- ---------------------------------------- Bench (Length: 210) ======================================== Rate hau1 net2 kco1 net1 ner2 ner1 kco2 hau1 18152/s -- -14% -31% -35% -36% -40% -41% net2 21115/s 16% -- -20% -24% -26% -30% -31% kco1 26414/s 46% 25% -- -5% -7% -12% -14% net1 27930/s 54% 32% 6% -- -2% -7% -9% ner2 28424/s 57% 35% 8% 2% -- -5% -8% ner1 30063/s 66% 42% 14% 8% 6% -- -2% kco2 30792/s 70% 46% 17% 10% 8% 2% -- ---------------------------------------- Bench (Length: 2100) ======================================== Rate hau1 net2 kco1 net1 ner2 ner1 kco2 hau1 1826/s -- -22% -30% -35% -37% -41% -41% net2 2342/s 28% -- -10% -17% -19% -24% -25% kco1 2597/s 42% 11% -- -8% -10% -16% -17% net1 2808/s 54% 20% 8% -- -3% -9% -10% ner2 2882/s 58% 23% 11% 3% -- -7% -7% ner1 3088/s 69% 32% 19% 10% 7% -- -1% kco2 3113/s 70% 33% 20% 11% 8% 1% -- ---------------------------------------- Bench (Length: 21000) ======================================== Rate hau1 net2 kco1 net1 ner2 ner1 kco2 hau1 185/s -- -23% -30% -33% -36% -39% -40% net2 241/s 30% -- -9% -13% -17% -21% -22% kco1 265/s 43% 10% -- -4% -9% -13% -14% net1 277/s 50% 15% 5% -- -4% -9% -10% ner2 290/s 57% 20% 9% 5% -- -5% -6% ner1 305/s 65% 26% 15% 10% 5% -- -1% kco2 308/s 66% 28% 16% 11% 6% 1% -- ---------------------------------------- Bench (Length: 210000) ======================================== Rate hau1 net2 kco1 net1 ner2 kco2 ner1 hau1 18.3/s -- -26% -31% -35% -38% -41% -41% net2 24.8/s 36% -- -7% -12% -16% -20% -21% kco1 26.6/s 45% 7% -- -6% -10% -15% -15% net1 28.2/s 54% 13% 6% -- -5% -9% -10% ner2 29.5/s 61% 19% 11% 5% -- -5% -5% kco2 31.1/s 70% 25% 17% 10% 5% -- -0% ner1 31.2/s 71% 26% 18% 11% 6% 0% -- ----------------------------------------
The 'use v5.38;' represents the version I'm using. You can change that to v5.36 without needing to change any other parts of the code. And, of course, you can wind it back to earlier versions: the further back you go, the more changes that will be needed.
CAVEAT for MSWin users: I expect the /dev/null on line 14 will need changing; however, I don't know what would be appropriate.
Update: See my updated reply to haj — that CAVEAT is no longer valid for the updated code.
— Ken
|
---|
Replies are listed 'Best First'. | |
---|---|
Re^2: Extracting /regex/mg in a while loop
by haj (Vicar) on Oct 10, 2023 at 11:54 UTC | |
by kcott (Archbishop) on Oct 10, 2023 at 16:20 UTC | |
Re^2: Extracting /regex/mg in a while loop
by Marshall (Canon) on Oct 10, 2023 at 19:18 UTC | |
by kcott (Archbishop) on Oct 10, 2023 at 23:04 UTC | |
Re^2: Extracting /regex/mg in a while loop
by NetWallah (Canon) on Oct 11, 2023 at 05:54 UTC | |
by kcott (Archbishop) on Oct 11, 2023 at 09:36 UTC |