in reply to hashes in regexes
As a lowly acolyte I have great respect for <kbd>map</kbd> and a fondness for good old <kbd>foreach</kbd>. So I naturally thought: isn't a loop with simple patterns faster than a pattern with <kbd>|</kbd> in it?
Here's my benchmarking code:
use Benchmark;

timethese(500, {
    'or pattern'    => \&orpattern,
    'many patterns' => \&manypatterns,
});

# Reset the shared hash and string before each run.
sub init {
    %sub_hash = (John => 'Mike', Jack => 'Mark', Joe => 'Moe');
    $string = "Dear John!.\nI've run off with Jack and Joe.\nSue\n\n" x 1000;
}

# A single substitution using one alternation pattern.
sub orpattern {
    init;
    my $keys = join '|', map "\Q$_\E", keys %sub_hash;
    my $keys_REx = qr/$keys/;
    $string =~ s/($keys_REx)/$sub_hash{$1}/g;
    return $string;
}

# One simple substitution per key, in a loop.
sub manypatterns {
    init;
    my %patterns = map { qr/$_/ => $sub_hash{$_} } keys %sub_hash;
    foreach $pat (keys %patterns) {
        my $replace = $patterns{$pat};
        $string =~ s/$pat/$replace/g;
    }
    return $string;
}
And the results:
Benchmark: timing 500 iterations of many patterns, or pattern...
many patterns:  2 wallclock secs ( 1.61 usr + 0.00 sys = 1.61 CPU)
   or pattern: 12 wallclock secs (11.94 usr + 0.01 sys = 11.95 CPU)
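Of course there's a difference in functionality between the two. Just think of:
%substitute_hash = (
    Jack  => 'Chris',
    Chris => 'Jaquline',
);
With the loop, a Jack that has already become Chris can be replaced again by Jaquline, depending on the order in which the keys come out of the hash. The single pattern replaces each match only once.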
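To make that difference in functionality concrete, here is a small sketch. The sample text and the variable names are my own, not part of the benchmark, and I fix the loop order by hand because hash order is not predictable:

```perl
use strict;
use warnings;

my %substitute_hash = (Jack => 'Chris', Chris => 'Jaquline');
my $text = "Jack and Chris";   # sample input, assumed for illustration

# Single alternation pattern: each match is replaced exactly once,
# because s///g never rescans the text it has just inserted.
my $alternation = join '|', map { quotemeta($_) } keys %substitute_hash;
(my $one_pass = $text) =~ s/($alternation)/$substitute_hash{$1}/g;

# One substitution per key. The order is hard-coded here for the demo;
# with "keys %substitute_hash" the order (and so the result) may differ.
my $looped = $text;
for my $name ('Jack', 'Chris') {
    $looped =~ s/\Q$name\E/$substitute_hash{$name}/g;
}

print "one pass: $one_pass\n";   # one pass: Chris and Jaquline
print "looped:   $looped\n";     # looped:   Jaquline and Jaquline
```

So the loop is only a drop-in replacement when no replacement value can itself match one of the keys.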
-- Brigitte 'I never met a chocolate I didnt like' Jellinek http://www.horus.com/~bjelli/ http://perlwelt.horus.at