in reply to How to speed up multiple regex in a loop for a big data?

Your code has a problem if there is any overlap between your source and target variable names. Instead of looping over your replacements for each line, you should build one large regular expression that searches for and replaces all the names in one go; that avoids cycles or chains of replacements:

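# Longest names first; quotemeta guards against regex metacharacters in the names
my $re = join '|', map { quotemeta($_) } sort { length($b) <=> length($a) } keys %mapper;
while (<IN>) {
    s/\b($re)\b/$mapper{$1}/g;
}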

There are modules for conveniently building such regular expressions in a more efficient form, like Regex::PreSuf.

Re^2: How to speed up multiple regex in a loop for a big data?
by MonkInPleasanton (Initiate) on May 25, 2006 at 06:31 UTC
    I thought the \b ... \b pairs would do the trick, since they match word boundaries. FYI, my mapping file has no redundant entries. Please correct me if I am wrong.

      You're correct with the \b word boundaries. The problem case I am thinking of is the following renaming setup:

      %mapper = ( foo => 'zap', zap => 'foo' );

      Here, "foo" will be replaced by "zap" in your loop, and then every "zap" (including the ones just created) will be replaced by "foo" again. But if that can't happen, all you'll gain with the large regex is speed ;)
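To make the difference concrete, here is a minimal sketch that runs both approaches on the swap mapping above. The sub names rename_in_loop and rename_in_one_pass and the sample input line are made up for illustration; they are not from the original code.

```perl
use strict;
use warnings;

# The swap mapping from the example above.
my %mapper = ( foo => 'zap', zap => 'foo' );

# One substitution per mapping entry: a later pass can re-rename
# text that an earlier pass has just produced.
sub rename_in_loop {
    my ($line) = @_;
    for my $old (sort keys %mapper) {
        $line =~ s/\b\Q$old\E\b/$mapper{$old}/g;
    }
    return $line;
}

# One combined alternation: every word of the input is matched and
# replaced at most once, and replacement text is never rescanned.
my $re = join '|', map { quotemeta($_) }
         sort { length($b) <=> length($a) } keys %mapper;

sub rename_in_one_pass {
    my ($line) = @_;
    $line =~ s/\b($re)\b/$mapper{$1}/g;
    return $line;
}

my $input = 'foo = zap + 1;';
print rename_in_loop($input), "\n";      # foo = foo + 1;
print rename_in_one_pass($input), "\n";  # zap = foo + 1;
```

The loop loses the original "zap" entirely, while the single pass swaps the two names as intended.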

        Theoretically, you are right. However, in my case the new names are the old variable names with prefixes and suffixes added around them, so it won't happen. Thanks though.
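A quick way to check that claim is to run the per-entry loop twice and compare the results. This is only a sketch: the names g_foo_v2 and g_bar_v2 are invented, and it assumes the prefix and suffix are joined to the old name with word characters such as an underscore.

```perl
use strict;
use warnings;

# Hypothetical mapping: each new name wraps the old one in a prefix and suffix.
my %mapper = ( foo => 'g_foo_v2', bar => 'g_bar_v2' );

# Same per-entry loop as in the original approach.
sub rename_in_loop {
    my ($line) = @_;
    for my $old (sort keys %mapper) {
        $line =~ s/\b\Q$old\E\b/$mapper{$old}/g;
    }
    return $line;
}

my $once  = rename_in_loop('foo = bar + foo;');
my $twice = rename_in_loop($once);   # a second run should change nothing

print "$once\n";
print $once eq $twice ? "stable\n" : "renamed again\n";
```

Because "_" counts as a word character, \bfoo\b cannot match inside g_foo_v2, so the renamed text is left alone. If the separator were a non-word character such as "." or "-", \b would match around the embedded old name and it would be renamed again.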
Re^2: How to speed up multiple regex in a loop for a big data?
by planetscape (Chancellor) on May 26, 2006 at 03:29 UTC