Re: Efficient string tokenization and substitution

I don't think your way is inefficient at all! I would do it nearly the same way. The /i modifier is not needed in the first solution: the pattern contains no literal letters, and the hash lookup already goes through lc. I also removed the "" from the regex.

$string =~
  s/([^\s\.\]\[]+)/exists($tokens_to_match{lc $1}) ? $tokens_to_match{lc $1} : $1/ge;
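To see the first substitution in action, here is a minimal, self-contained sketch. The contents of %tokens_to_match and the sample string are made up for illustration; the only real assumption is that the hash keys are stored in lowercase, since the lookup uses lc.

```perl
use strict;
use warnings;

# Hypothetical lookup table; keys must be lowercase because the
# substitution looks each token up with lc.
my %tokens_to_match = (
    foo => 'FOO_REPLACED',
    bar => 'BAR_REPLACED',
);

my $string = 'Foo [bar] baz.foo';

# Every run of characters that is not whitespace, a dot or a bracket
# counts as one token. Known tokens are replaced, unknown ones are kept.
$string =~
  s/([^\s\.\]\[]+)/exists($tokens_to_match{lc $1}) ? $tokens_to_match{lc $1} : $1/ge;

print "$string\n";    # FOO_REPLACED [BAR_REPLACED] baz.FOO_REPLACED
```

Note that this does one hash lookup per token in the string, no matter how many keys the hash has.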

Or, if you have a lot of tokens, this might be faster:

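# Longest keys first, so a longer token wins over one of its prefixes;
# quotemeta keeps regex metacharacters in the keys from being interpreted.
my $str = join '|',
  map  { quotemeta($_) }
  sort { length $b <=> length $a || $a cmp $b } keys %tokens_to_match;
# The trailing delimiter is only a lookahead, so it is not consumed
# (and not dropped from the result).
$string =~
  s/(\b|\s|\.|\[|\])($str)(?=\b|\s|\.|\[|\])/
    $1 . ( exists($tokens_to_match{lc $2})
      ? $tokens_to_match{lc $2}
      : $2
    )
  /gei;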
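Why sort by length first? A small sketch with made-up data shows it. I am assuming here that all keys are lowercase and start and end with a word character, so plain \b is enough as a delimiter; the keys and the sample string are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical tokens: one key is a prefix of the other.
my %tokens_to_match = (
    'new'      => 'N',
    'new york' => 'NY',
);

# Longest keys first, so "new york" is tried before "new".
# quotemeta escapes anything that would be special in a regex.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length $b <=> length $a || $a cmp $b } keys %tokens_to_match;

my $string = 'New York is new.';

# /i matches case-insensitively; lc maps the match back to the hash key.
$string =~ s/\b($alternation)\b/$tokens_to_match{lc $1}/gei;

print "$string\n";    # NY is N.
```

With the shorter key first in the alternation, "New" would match on its own and "York" would be left behind in the string.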

Boris
