in reply to Algorithm to search and replace on data file, based on a table file?
Another way to solve your problem would be to search and replace all the patterns simultaneously in one regular expression instead of doing them sequentially. This will also give you a fairly substantial performance boost, since instead of invoking the regular expression engine 600 times for every line you just do it once per line.
Since it sounds like your patterns are just strings to match literally, you can probably get away with joining them all together, separated by |. Note that a large alternation like this is fairly inefficient before Perl 5.10, which added an optimization for alternations of literal strings. Even so, it is probably faster than what you have now, since it moves the loop into the regular expression engine's C code. If you haven't moved to 5.10 yet, look into Regexp::Assemble to create the regular expression.
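As a minimal sketch of the joining step, assuming a small made-up table (the `%table` contents and the sample line are illustrative, not taken from your data), the keys can be escaped with `quotemeta` so that characters such as `.` match literally rather than as regex metacharacters:

```perl
use strict;
use warnings;

# Hypothetical search => replacement table, for illustration only.
my %table = (
    'a.b' => 'X',
    'cat' => 'dog',
);

# quotemeta escapes regex metacharacters so each key matches literally.
my $alternation = join '|', map { quotemeta } sort keys %table;
my $re          = qr/($alternation)/;

my $line = "a.b axb cat";
( my $out = $line ) =~ s/$re/$table{$1}/g;

print "$out\n";    # prints "X axb dog": "axb" is untouched because "." was escaped
```

Without the escaping, the key `a.b` would also match `axb`, which is probably not what a plain string table intends.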
Your code would then look something like this:
    # In your code you are sorting on the length of the replacement strings
    # instead of the search strings. I'm guessing that's not what you want,
    # so I switched it to sort on the search patterns.
    my @keys_ordered = sort { length $b <=> length $a } keys %$table_ref;

    # Join your strings into one big long RE
    my $re_string = join '|', map( qr/\Q$_\E/, @keys_ordered );
    my $re = qr/($re_string)/;

    my $replacecount = 0;
    while ( my $line = <INFILE> ) {
        # The inner loop is gone
        $replacecount += ( $line =~ s/$re/$table_ref->{$1}/g );
        print OUTFILE $line;
    }
    print "Made $replacecount replacements.\n";
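The reason for sorting the search strings longest first is that Perl tries the branches of an alternation from left to right and takes the first one that matches at a given position. A rough sketch of the difference, using a hypothetical two-entry table in which one key is a prefix of the other (`replace_all` is a helper invented for this illustration):

```perl
use strict;
use warnings;

# Hypothetical table in which one key is a prefix of another.
my %table = ( 'foo' => 'A', 'foobar' => 'B' );

# Build one alternation from the keys in the order given and apply it.
sub replace_all {
    my ( $text, @keys ) = @_;
    my $re_string = join '|', map { quotemeta } @keys;
    $text =~ s/($re_string)/$table{$1}/g;
    return $text;
}

my @shortest_first = sort { length($a) <=> length($b) } keys %table;
my @longest_first  = sort { length($b) <=> length($a) } keys %table;

print replace_all( 'foobar', @shortest_first ), "\n";    # prints "Abar"
print replace_all( 'foobar', @longest_first ),  "\n";    # prints "B"
```

With the shorter key first, `foo` wins and the longer key never gets a chance to match, so ordering by the length of the search strings keeps the longest match.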
Re^2: Algorithm to search and replace on data file, based on a table file?
by dolmen (Beadle) on Sep 24, 2009 at 08:46 UTC
Re^2: Algorithm to search and replace on data file, based on a table file?
by koknat (Sexton) on Sep 24, 2009 at 21:31 UTC
by dirving (Friar) on Sep 25, 2009 at 02:19 UTC