Another way to solve your problem would be to search and replace all the patterns simultaneously in one regular expression instead of doing them sequentially. This should also give you a fairly substantial performance boost: instead of invoking the regular expression engine 600 times for every line, you invoke it once per line.

Since it sounds like your patterns are just strings to match against, you can probably get away with joining them all together, separated by |. Just note that this is pretty inefficient before Perl 5.10, although it is probably still faster than what you have now, since it moves the loop into the regular expression engine's C code. If you haven't moved to 5.10 yet, look into the CPAN module Regexp::Assemble to create the regular expression.

Your code would then look something like this:

    # In your code you are sorting on the length of the replacement strings
    # instead of the search strings. I'm guessing that's not what you want,
    # so I switched it to sort on the search patterns.
    my @keys_ordered = sort { length $b <=> length $a } keys %$table_ref;

    # Join your strings into one big long RE.
    # quotemeta escapes regex metacharacters so each key matches literally.
    my $re_string = join '|', map { quotemeta } @keys_ordered;
    my $re = qr/($re_string)/;

    my $replacecount = 0;
    while ( my $line = <INFILE> ) {
        # The inner loop is gone
        $replacecount += ( $line =~ s/$re/$table_ref->{$1}/g );
        print OUTFILE $line;
    }
    print "Made $replacecount replacements.\n";
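
The sort on key length matters because Perl tries alternatives left to right and takes the first one that matches, not the longest. Here is a minimal sketch of the difference; the %table contents and the build_re and replace_all helpers are made up for illustration and are not from your code:

```perl
use strict;
use warnings;

# Hypothetical lookup table, for illustration only.
my %table = (
    cat      => 'feline',
    category => 'group',
    dog      => 'canine',
);

# Build one capturing alternation from the keys, in the order given.
sub build_re {
    my @keys = @_;
    my $alternation = join '|', map { quotemeta } @keys;
    return qr/($alternation)/;
}

# Replace every match in $text using %table; returns (new text, count).
sub replace_all {
    my ( $re, $text ) = @_;
    my $count = ( $text =~ s/$re/$table{$1}/g );
    return ( $text, $count || 0 );
}

my $input = "category of cat and dog";

# Ties on length are broken alphabetically so the order is deterministic.
my $short_first = build_re( sort { length $a <=> length $b or $a cmp $b } keys %table );
my $long_first  = build_re( sort { length $b <=> length $a or $a cmp $b } keys %table );

my ( $bad,  $bad_count )  = replace_all( $short_first, $input );
my ( $good, $good_count ) = replace_all( $long_first,  $input );

print "shortest first: $bad\n";     # felineegory of feline and canine
print "longest first:  $good\n";    # group of feline and canine
```

With the shortest keys first, "cat" wins at the start of "category" and the longer key never gets a chance, which is why the keys are sorted longest first before joining.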
-- David Irving

In reply to Re: Algorithm to search and replace on data file, based on a table file? by dirving
in thread Algorithm to search and replace on data file, based on a table file? by koknat
