Update: One of the reasons I wrote this reply was to illustrate the issues in writing a good benchmark, specifically when benchmarking code with and without $& in it. But it seems that I made a number of mistakes myself. Make sure you read Abigail-II's comments below after you read this post; some of what I say turns out to be wrong.


While I agree that benchmarking your solution against blackadder's is an interesting idea, I have to point out that the way you have done it will produce results that are both incorrect (because your benchmark doesn't do what you think it does) and misleading (because you aren't testing fairly: that dagnatted $& has bitten you in debugger :-).

So let's look at the problems with your benchmarking code:

    # THIS doesn't mean \\server_name\sys_share, it means \server_name\sys_share
    $unc = '\\server_name\sys_share';

    # and then you strip the leading "\server_name" off $unc before the
    # benchmark even starts, leaving only "\sys_share"!
    $unc =~ s/^\W*\w+//;
    $server = $&;
    $server =~ s/^\W+//;

    # None of the regexes in the benchmark will match anymore (in a meaningful way)

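The quoting problem can be checked in isolation. This is just a small sketch (the variable names $one and $two are mine, not from the original code) of how Perl treats backslashes inside single-quoted strings:

```perl
use strict;
use warnings;

# In a single-quoted string, \\ collapses to a single backslash.
# A backslash in front of any other character (apart from the quote
# itself) is kept literally.
my $one = '\\server_name\sys_share';       # ends up with ONE leading backslash
my $two = '\\\\server_name\\sys_share';    # ends up with TWO: a real UNC path

print "$one\n";
print "$two\n";
```

So to get a UNC path with its two leading backslashes out of a single-quoted literal, you have to type four.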
So to do the benchmark properly I modified your code:

    #!perl
    use warnings;
    use Benchmark;

    $unc = '\\\\server_name\\sys_share';

    my $re = Benchmark::timethese(-5, {
        blackadder => sub {
            $lunc = $unc;
            $lunc =~ s/^\W*\w+//;
            $server = $&;
            $server =~ s/^\W+//;
        },
        theorbtwo => sub {
            $unc =~ m/^\\\\([^\\]+)\\/;
            $server = $1;
        }
    });
    Benchmark::cmpthese($re);
    __END__
    Benchmark: running blackadder, theorbtwo, each for at least 5 CPU seconds...
    blackadder:  6 wallclock secs ( 5.12 usr +  0.00 sys =  5.12 CPU) @ 131371.90/s (n=673281)
     theorbtwo:  5 wallclock secs ( 5.34 usr +  0.00 sys =  5.34 CPU) @ 190423.65/s (n=1017624)
                   Rate blackadder  theorbtwo
    blackadder 131372/s         --       -31%
    theorbtwo  190424/s        45%         --

Which shows that your method is faster than blackadder's, but not by much: only 45%. Luckily for you this number is _still_ totally wrong. The reason is that $& has an interesting effect on regexes _ANYWHERE_ in a program that uses it: it slows them down massively, because once perl has seen $& it has to save a copy of the target string for every successful match. (japhy has written a number of articles about this, and about some approaches to resolving the problem.) So in order to benchmark a solution that uses $& against one that doesn't, we need to run them in different perl processes (not forked! totally different), like so:
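For what it's worth, one way to get at the matched text without ever mentioning $& is to rebuild it from the match offsets in @- and @+. This is only a sketch of that general idea under my own assumptions, not necessarily the approach japhy describes, and it uses a plain match rather than blackadder's s/// so that the offsets still refer to the unmodified string:

```perl
use strict;
use warnings;

my $unc = '\\\\server_name\\sys_share';
my $server;

if ($unc =~ /^\W*\w+/) {
    # $-[0] is the start offset of the whole match and $+[0] its end offset,
    # so this substr() yields the same text $& would have held.
    $server = substr($unc, $-[0], $+[0] - $-[0]);
    $server =~ s/^\W+//;    # strip the leading backslashes
}

print "$server\n";
```

Because $& never appears anywhere in the program, the other regexes in it do not pay for it.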

Program 1: bm_blackadder.pl

    # benchmark saw ampersand -- BlackAdder
    use strict;
    use warnings;
    use Benchmark qw(timethis);
    use Data::Dumper;

    my $count = $ARGV[0] || -1;
    my $unc   = $ARGV[1] || '\\\\server_name\\sys_share';
    print "Matching $unc for $count\n";
    print Dumper(timethis($count, sub {
        my $lunc = $unc;
        $lunc =~ s/^\W*\w+//;
        (my $server = $&) =~ s/^\W+//;
        $server
    }, 'blackadder'));

Program 2: bm_theorbtwo.pl

    # benchmark never saw ampersand -- theorbtwo
    use strict;
    use warnings;
    use Benchmark qw(timethis);
    use Data::Dumper;

    my $count = $ARGV[0] || -1;
    my $unc   = $ARGV[1] || '\\\\server_name\\sys_share';
    print "Matching $unc for $count\n";
    print Dumper(timethis($count, sub {
        $unc =~ m/^\\\\([^\\]+)\\/;
        $1;
    }, 'theorbtwo'));

Program 3: run_bm.pl
(Run the others and return the results of both, compared together.)

    use Benchmark 'cmpthese';
    use Data::Dumper;

    # Takes the output of one benchmark script, prints everything before
    # the Data::Dumper dump, and evals the dump back into a Benchmark object.
    sub run_bm($) {
        my $str = shift;
        my $h;
        $str =~ s/\A(.*)\$VAR1 =/$h=$1;''/se;
        print $h;
        my $v = eval($str);
        die $@ if $@;
        $v
    }

    my $opts = '-5 \\\\foo\\bar\\baz.exe';
    my $hash = {
        blackadder => run_bm(`perl bm_blackadder.pl $opts`),
        theorbtwo  => run_bm(`perl bm_theorbtwo.pl $opts`),
    };
    cmpthese($hash);
    __END__

When everything is set up correctly, run_bm.pl outputs:

    Matching \\foo\bar\baz.exe for -5
    blackadder:  6 wallclock secs ( 5.22 usr +  0.00 sys =  5.22 CPU) @ 135204.60/s (n=705768)
    Matching \\foo\bar\baz.exe for -5
     theorbtwo:  6 wallclock secs ( 5.17 usr +  0.00 sys =  5.17 CPU) @ 367160.48/s (n=1898954)
                   Rate blackadder  theorbtwo
    blackadder 135205/s         --       -63%
    theorbtwo  367160/s       172%         --

Showing that your solution is about 172% faster than blackadder's! Much better than the 50% or so that you might have thought it was! (It also shows the cost that $& imposes on your code if you are foolish enough to use it, or if someone else has snuck it into their code and you don't know about it.)

BTW, you are correct that blackadder's solution is not correct; my point in this reply is that benchmarking regexes that use $& is not as simple as one might think (or like). Oh, and in the future you should avoid using fixed counts in your benchmarks. It is almost always better to use a negative number, which tells Benchmark how many CPU seconds to run for. The more seconds the better (I find 5-10 seconds is usually good).
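To illustrate the negative-count form, here is a rough sketch. The two subs (capture and split) are stand-ins of my own that both avoid $&; only the capture one comes from this thread, and -1 is used just to keep the demo short:

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

my $unc = '\\\\server_name\\sys_share';

# A negative count means "run each sub for at least that many CPU seconds",
# so Benchmark works out the iteration count itself.
# Use -5 to -10 for real measurements.
my $results = timethese(-1, {
    capture => sub { my ($s) = $unc =~ m/^\\\\([^\\]+)\\/; $s },
    split   => sub { my $s = (split /\\/, $unc)[2]; $s },
});

# Print the comparison table of rates.
cmpthese($results);
```

With a fixed count, a fast sub can finish in so little CPU time that the measurement is mostly noise; a time budget avoids having to guess a suitable count.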

HTH.

Yves / DeMerphq
---
Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)


In reply to Unfortunately benchmarking things with $& isn't easy... by demerphq
in thread Obtaining server name from UNC path by blackadder
