in reply to Efficient string tokenization and substitution

I think your way is _not_ inefficient! I would do it nearly the same way. The /i modifier is not needed in the first solution: the pattern contains no letters, and the hash lookup already lowercases the match with lc. I also removed the "" from the regex.
$string =~ s/([^\s\.\]\[]+)/exists($tokens_to_match{lc $1}) ? $tokens_to_match{lc $1} : $1/ge;
Or (this might be faster if you have a lot of tokens), build one alternation from the keys, longest first:
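# quotemeta keeps regex metacharacters in the keys from being interpreted
my $str = join '|', map { quotemeta($_) } sort { length $b <=> length $a || $a cmp $b } keys %tokens_to_match;
# (?=...) only checks the trailing delimiter, so it is not consumed and dropped from the result
$string =~ s/(\b|\s|\.|\[|\])($str)(?=\b|\s|\.|\[|\])/ $1 . ( exists($tokens_to_match{lc $2}) ? $tokens_to_match{lc $2} : $2 ) /gei;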
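For anyone who wants to try both variants side by side, here is a minimal self-contained sketch. The contents of %tokens_to_match, the input string and the sub names are made up for illustration and are not from the original question. The second variant assumes the keys are stored in lower case, escapes them with quotemeta, and uses a lookahead for the trailing delimiter.

```perl
use strict;
use warnings;

# Hypothetical token table for illustration only; keys are lower case.
my %tokens_to_match = (
    foo => 'FOO_VALUE',
    bar => 'BAR_VALUE',
);

# Variant 1: every run of characters that is not whitespace, '.', '[' or ']'
# is a candidate token; look it up (lowercased) and keep it unchanged if unknown.
sub substitute_by_lookup {
    my ($string) = @_;
    $string =~ s/([^\s.\[\]]+)/exists($tokens_to_match{lc $1}) ? $tokens_to_match{lc $1} : $1/ge;
    return $string;
}

# Variant 2: build one alternation of all keys, longest first, so the regex
# only stops at known tokens that sit between delimiters.
sub substitute_by_alternation {
    my ($string) = @_;
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) || $a cmp $b }
        keys %tokens_to_match;
    # $1 is the leading delimiter (possibly empty) and is put back;
    # the trailing delimiter is only checked by the lookahead, not consumed.
    $string =~ s/(\b|\s|\.|\[|\])($alternation)(?=\b|\s|\.|\[|\])/$1 . (exists($tokens_to_match{lc $2}) ? $tokens_to_match{lc $2} : $2)/gei;
    return $string;
}

my $input = 'Foo [bar] baz.foo';
print substitute_by_lookup($input), "\n";
print substitute_by_alternation($input), "\n";
```

With this sample data both calls print `FOO_VALUE [BAR_VALUE] baz.FOO_VALUE`. Which variant is actually faster depends on the number of tokens and the input, so it is worth timing both (for example with the core Benchmark module) on real data.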
Boris