# THIS doesn't mean \\server_name\sys_share, it means \server_name\sys_share
$unc = '\\server_name\sys_share';
# and then you strip the leading "\server_name" from $unc, leaving only "\sys_share", before the benchmark even starts!
$unc =~ s/^\W*\w+//;
$server = $&;
$server =~ s/^\W+//;
# None of the regexes in the benchmark will match anymore (in a meaningful way)
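To see the quoting problem for yourself, here is a minimal sketch (the variable names are mine, not from your code) that prints what each single-quoted literal really holds, and what is left after the setup substitution has run:

```perl
use strict;
use warnings;

# In a single-quoted string a doubled backslash collapses to one,
# and a backslash before any other character is kept as-is.
my $two_slashes  = '\\server_name\sys_share';
my $four_slashes = '\\\\server_name\\sys_share';

print "$two_slashes\n";     # prints \server_name\sys_share
print "$four_slashes\n";    # prints \\server_name\sys_share

# The setup substitution eats the server part of the string itself,
# so the benchmarked regexes have nothing sensible left to match.
(my $leftover = $two_slashes) =~ s/^\W*\w+//;
print "$leftover\n";        # prints \sys_share
```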
So to do the benchmark properly I modified your code:
#!perl
use warnings;
use Benchmark;
$unc = '\\\\server_name\\sys_share';
my $re = Benchmark::timethese(-5,
    {
        blackadder => sub {
            $lunc = $unc;
            $lunc =~ s/^\W*\w+//;
            $server = $&;
            $server =~ s/^\W+//;
        },
        theorbtwo => sub {
            $unc =~ m/^\\\\([^\\]+)\\/;
            $server = $1;
        }
    }
);
Benchmark::cmpthese($re);
__END__
Benchmark: running blackadder, theorbtwo, each for at least 5 CPU seconds...
blackadder: 6 wallclock secs ( 5.12 usr + 0.00 sys = 5.12 CPU) @ 131371.90/s (n=673281)
theorbtwo: 5 wallclock secs ( 5.34 usr + 0.00 sys = 5.34 CPU) @ 190423.65/s (n=1017624)
               Rate blackadder  theorbtwo
blackadder 131372/s         --       -31%
theorbtwo  190424/s        45%         --
Which shows that your method is faster than blackadder's, but not by much: only 45%. Luckily for you, this number is _still_ totally wrong. The reason is that $& has an interesting effect on regexes _ANYWHERE_ in a program that uses it: it slows all of them down massively, because once perl has seen $& it has to save a copy of the matched string for every successful match. (japhy has written a number of articles about this, and some approaches to resolving the problem.) So in order to benchmark a solution that uses $& against one that doesn't, we need to run them in different perl processes (not forked! totally different), like so:
Program 1: bm_blackadder.pl
# benchmark saw ampersand -- BlackAdder
use strict;
use warnings;
use Benchmark qw(timethis);
use Data::Dumper;
my $count=$ARGV[0] || -1;
my $unc =$ARGV[1] || '\\\\server_name\\sys_share';
print "Matching $unc for $count\n";
print Dumper(timethis($count, sub {
    my $lunc = $unc;
    $lunc =~ s/^\W*\w+//;
    (my $server = $&) =~ s/^\W+//;
    $server
}, 'blackadder'
));
Program 2: bm_theorbtwo.pl
# benchmark saw ampersand -- theorbtwo
use strict;
use warnings;
use Benchmark qw(timethis);
use Data::Dumper;
my $count=$ARGV[0] || -1;
my $unc =$ARGV[1] || '\\\\server_name\\sys_share';
print "Matching $unc for $count\n";
print Dumper(timethis($count, sub {
    $unc =~ m/^\\\\([^\\]+)\\/;
    $1;
}, 'theorbtwo'
));
Program 3: run_bm.pl (Run the others and return the results of both, compared together.)
use Benchmark 'cmpthese';
use Data::Dumper;
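sub run_bm($){
    my $str=shift;
    my $h;
    # Strip everything before the Dumper output (the "Matching ..." line
    # and the timing line) and keep it so it can be echoed.
    $str=~s/\A(.*)\$VAR1 =/$h=$1;''/se;
    print $h;
    # What remains is the dumped Benchmark object; eval rebuilds it.
    my $v=eval($str);
    die $@ if $@;
    $v
}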
my $opts='-5 \\\\foo\\bar\\baz.exe';
my $hash={
blackadder => run_bm(`perl bm_blackadder.pl $opts`),
theorbtwo => run_bm(`perl bm_theorbtwo.pl $opts`),
};
cmpthese($hash);
__END__
Which, when set up correctly, run_bm.pl outputs:
Matching \\foo\bar\baz.exe for -5
blackadder: 6 wallclock secs ( 5.22 usr + 0.00 sys = 5.22 CPU) @ 135204.60/s (n=705768)
Matching \\foo\bar\baz.exe for -5
theorbtwo: 6 wallclock secs ( 5.17 usr + 0.00 sys = 5.17 CPU) @ 367160.48/s (n=1898954)
               Rate blackadder  theorbtwo
blackadder 135205/s         --       -63%
theorbtwo  367160/s       172%         --
Showing that your solution is about 172% faster than blackadder's! Much better than the roughly 50% faster that you might have thought it was! (It also shows the cost that $& imposes on your code if you are foolish enough to use it, or if someone else has snuck it into their code and you don't know about it.)
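If you need the matched text but want to stay clear of $&, a capture group does the same job. Here is a minimal sketch, not a drop-in replacement for anyone's code: the sub names are made up for illustration, and the second variant relies on the /p flag and ${^MATCH}, which need perl 5.10 or later.

```perl
use strict;
use warnings;

# Same idea as the $& version, but the wanted word is captured explicitly.
sub server_via_capture {
    my ($unc) = @_;
    return $unc =~ /^\W*(\w+)/ ? $1 : undef;
}

# /p asks perl to preserve the match for this one regex only,
# and ${^MATCH} then holds what $& would have held.
sub server_via_p_flag {
    my ($unc) = @_;
    return undef unless $unc =~ /^\W*\w+/p;
    (my $server = ${^MATCH}) =~ s/^\W+//;
    return $server;
}

my $unc = '\\\\server_name\\sys_share';
print server_via_capture($unc), "\n";   # prints server_name
print server_via_p_flag($unc), "\n";    # prints server_name
```

Either sub could be dropped into the timethis call in place of the $& code, and then the two solutions could share one process again.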
BTW, you are correct that blackadder's solution is not correct. My point in this reply is that benchmarking regexes that use $& is not as simple as one might think (or like).
Oh, also: in the future you should avoid using fixed counts in your benchmarks. It is almost always better to use a negative number, which tells Benchmark how many CPU seconds to run for. The more seconds the better (I find 5-10 seconds is usually good).
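For example, here is a sketch of a time-limited run. I use -1 only to keep it quick (use -5 or more for real measurements), and the 'none' style is just there to suppress Benchmark's own printing so the result can be reported by hand:

```perl
use strict;
use warnings;
use Benchmark qw(timethis timestr);

my $unc = '\\\\server_name\\sys_share';

# A negative count means "run for at least this many CPU seconds",
# so Benchmark works out the iteration count for you.
my $t = timethis(-1, sub {
    my ($server) = $unc =~ m/^\\\\([^\\]+)\\/;
    $server;
}, 'theorbtwo', 'none');

print "theorbtwo: ", timestr($t), "\n";
print "iterations: ", $t->iters, "\n";
```

With a fixed count you have to guess a number big enough for a fast sub; with a time limit both fast and slow subs get a comparable amount of CPU time.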
HTH.
Yves / DeMerphq
---
Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)