No. Paladin's solution performs two operations for each piece of data: one regex match and one hash lookup. The original solution tries anywhere from 1 to n alternatives (in this case, n=15) for each piece of data, depending on how soon one of them matches. Paladin's cost is therefore constant, O(1), while the original averages around n/2 attempts, which is O(n). A test:
use Benchmark;
my %names;
my (@list) =
qw(Jones Rogers Edwards Smith Jackson
Ryan Jones tilly dws paladin
footpad jeffa Elian ybiC TheDamian
);
@names{@list} = (1) x @list;
my $names = join '|', @list;
my $data = do {local $/; <DATA>};
timethese (
100_000, {
"paladin" =>
sub {
my $text = $data;
foreach my $name ($text=~/(\b(?:[A-Z](?:\.|[a-z]+)\s+)+(\w+))/go){
"$name\n" if exists $names{$name}
}
},
"original" =>
sub {
my $text = $data;
foreach my $name ($text=~/(\b(?:[A-Z](?:\.|[a-z]+)\s+)+(?:$names))/sgo){
"$name\n"
}
}
});
__DATA__
Dr. Happy
Sr. Rogers
Senoir. Chacho
Senoira. Chachese
Mr. Ryan
Mrs. Smith
(I'm sorry) Ms. Jackson (oooh, I am for reaaal)
Dr. Tilly
Mr. Elian
Asdokfj. adfsdf
Ms. asdfasdf
Mr. Burns
Qsdokfj. adfsdf
q. TheDamian
Hello. There
This. Should
Not. Fail
And the results:
Benchmark: timing 100000 iterations of optimized, original, paladin...
original: 25 wallclock secs (23.14 usr + 0.00 sys = 23.14 CPU) @ 4320.77/s (n=100000)
paladin: 18 wallclock secs (16.93 usr + 0.00 sys = 16.93 CPU) @ 5905.63/s (n=100000)
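To make the operation counts concrete, here is a rough sketch (not part of the benchmark) that models the alternation as a left-to-right scan over the same 15-entry list and compares it with a hash lookup. The helper name scan_cost is hypothetical, and the scan is only a simplified stand-in for what the regex engine does; newer perls may optimize literal alternations, so the real engine can do better than this model suggests.

```perl
use strict;
use warnings;

# Same 15-entry list as in the benchmark (note that "Jones" appears twice).
my @list = qw(Jones Rogers Edwards Smith Jackson
              Ryan Jones tilly dws paladin
              footpad jeffa Elian ybiC TheDamian);

my %names;
@names{@list} = (1) x @list;

# Hypothetical helper: count the string comparisons a left-to-right scan
# needs. This is a simplified model of trying each alternative in turn.
sub scan_cost {
    my ($candidate) = @_;
    my $tries = 0;
    for my $name (@list) {
        $tries++;
        return $tries if $name eq $candidate;
    }
    return $tries;    # no match: every entry was tried
}

for my $candidate (qw(Jones Smith TheDamian Burns)) {
    printf "%-10s scan tries: %2d  hash lookup: %s\n",
        $candidate, scan_cost($candidate),
        exists $names{$candidate} ? 'hit' : 'miss';
}
```

In this model a name near the front of the list costs one comparison, a name at the end or a name that is not in the list costs all 15, and the hash lookup is a single step either way.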
In response to your update, I think you are mistaken; "end," doesn't match anywhere at all.