#!/usr/bin/perl
use Benchmark qw/cmpthese/;
$defaulttext = q/foo / x 30;
# $defaulttext = q/foobar / x 30;
cmpthese( 100_000, {
slash_b => q{$text=$defaulttext; $text =~ s/\bfoo\b//g;},
neg_look=> q{$text=$defaulttext; $text =~ s/(?<!\w)foo(?!\w)//g;},
pos_look=> q{$text=$defaulttext; $text =~ s/(?<=[^\w])foo(?=[^\w])//g;},
});
With $defaulttext set to 'foo foo ...', all three methods
take approximately the same time, because modifying $text
dominates the running time.
With $defaulttext set to 'foobar foobar ...' (so that no
replacements are made), I get the following results:
            Rate pos_look neg_look  slash_b
pos_look 27894/s       --      -8%     -35%
neg_look 30441/s       9%       --     -29%
slash_b  42662/s      53%      40%       --
This shows that the \b variant is about 40% to 50% quicker
than the two lookaround variants, and that the negative
lookaround is slightly faster (about 9%) than the positive
lookaround with a negated character class.
But the most important difference can be seen
from the following code:
$text= q/foo bar foo/;
($tmp = $text) =~ s/\bfoo\b//g;
print $tmp,"\n";
($tmp = $text) =~ s/(?<!\w)foo(?!\w)//g;
print $tmp,"\n";
($tmp = $text) =~ s/(?<=[^\w])foo(?=[^\w])//g;
print $tmp,"\n";
# which prints (the first two results keep the spaces around 'bar'):
 bar 
 bar 
foo bar foo
The positive lookaround does not behave like the others
at the boundaries of the string. This is because a
positive lookaround has to match an actual character (here
one from the class [^\w]), and there is no character before
the beginning of the string or after its end, so the match
fails there. The negative lookaround only asserts that no
word character is adjacent, so it succeeds even if no
character is there.
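If the positive lookarounds are wanted anyway, one possible workaround is to allow the string boundaries explicitly as alternatives. The following is only a minimal sketch, not part of the benchmark above: the helper name strip_foo_anchored is made up for illustration, and the \A / \z alternatives are placed outside the lookarounds so that the lookbehind stays fixed-width.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the code above): removes the whole word
# "foo" using positive lookarounds, with explicit \A and \z alternatives
# so that the pattern can also match at the start and end of the string.
sub strip_foo_anchored {
    my ($text) = @_;
    (my $tmp = $text) =~ s/(?:\A|(?<=\W))foo(?:\z|(?=\W))//g;
    return $tmp;
}

# Compare against the \b variant on a few sample strings.
for my $text ('foo bar foo', 'foobar foo', 'a foo b') {
    (my $plain = $text) =~ s/\bfoo\b//g;
    printf "%-13s \\b: [%s]  anchored: [%s]\n",
        "'$text'", $plain, strip_foo_anchored($text);
}
```

For these sample strings the two columns agree, including the 'foo' at the very beginning and end of 'foo bar foo'. Whether the extra alternations are worth it compared to simply using \b is a separate question that this sketch does not measure.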
-- Hofmator