in reply to timtowtdi but which is the [quote fingers]best[/quote fingers]

Compared to I/O time, any technique from carving on a stone tablet up is likely to be fast enough. Until you see a speed problem in practice with real data, don't worry about it. If you are worried, profile the code to find out where the time is actually spent rather than guessing. And always remember that a change of algorithm will generally get better results than fiddling with details.
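
If it does come to measuring, a minimal sketch along these lines can compare two candidates. Note that this uses the core Benchmark module, which times whole routines rather than profiling line by line (a profiler such as Devel::NYTProf is the usual tool for that). The sample data and the sub names with_hash and with_two_passes are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data; substitute lines from your real input.
my @lines = ('Is it TRUE or FALSE?') x 1_000;

my %translate = (true => 1, false => 0);

# Candidate 1: one case-insensitive alternation with a hash lookup.
sub with_hash {
    my @copy = @lines;
    s/\b(true|false)\b/$translate{lc $1}/ige for @copy;
    return \@copy;
}

# Candidate 2: one substitution per word.
sub with_two_passes {
    my @copy = @lines;
    for (@copy) {
        s/\btrue\b/1/ig;
        s/\bfalse\b/0/ig;
    }
    return \@copy;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    hash_lookup => \&with_hash,
    two_passes  => \&with_two_passes,
});
```

The numbers only mean something when @lines resembles the real input, which is the point made above.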

Having said that, the following technique may be of interest:

#!/usr/bin/perl -w
use warnings;
use strict;

my %translate = (
    "string a" => 1,
    "string b" => 2,
    "string c" => 3,
    true       => 1,
    false      => 0,
);

my $match = "\Q" . join ("\E|\Q", keys %translate) . "\E";

while (<DATA>) {
    s/($match)/$translate{lc $1}/ige;
    print;
}

__DATA__
what is the best way to translate "TRUE" and "FALSE" to "1" or "0" respectively?
or "string a", "String B", "String C" to "1-3" respectively.

Prints:

what is the best way to translate "1" and "0" to "1" or "0" respectively?
or "1", "2", "3" to "1-3" respectively.
True laziness is hard work

Re^2: timtowtdi but which is the [quote fingers]best[/quote fingers]
by AnomalousMonk (Archbishop) on Apr 14, 2011 at 18:26 UTC
    my $match = "\Q" . join ("\E|\Q", keys %translate) . "\E";

    There's a problem here. In interpolated strings, the \Q and \E escapes act as interpolation 'controls' (if that's the right term): they affect only the text inside the same string literal, so a "\Q" or "\E" standing alone quotes nothing and softly and silently vanishes away. Except for raising a ruckus, using non-interpolating strings doesn't help matters:

    >perl -wMstrict -le
    "my %translate = (
       'tr??u++e' => 1,
       'fa+l?se'  => 0,
       );
     ;;
     my $match = qq{\Q} . join(qq{\E|\Q}, keys %translate) . qq{\E};
     print qq{'$match'};
     $match = qr{$match};
     print $match;
     ;;
     my $mooch = '\Q' . join('\E|\Q', keys %translate) . '\E';
     print qq{'$mooch'};
     $mooch = qr{$mooch};
     print $mooch;
    "
    'tr??u++e|fa+l?se'
    (?-xism:tr??u++e|fa+l?se)
    '\Qtr??u++e\E|\Qfa+l?se\E'
    Unrecognized escape \Q passed through in regex...
    Unrecognized escape \E passed through in regex...
    Unrecognized escape \Q passed through in regex...
    Unrecognized escape \E passed through in regex...
    (?-xism:\Qtr??u++e\E|\Qfa+l?se\E)
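
    One possible way around the vanishing escapes, as a sketch only: quote each key inside the same string literal that interpolates it, then join the already-quoted pieces. The sub name quoted_alternation and the sample text are hypothetical, and the sort is there only to make the printed pattern repeatable.

    ```perl
    use strict;
    use warnings;

    # Keys containing regex metacharacters, as in the demonstration above.
    my %translate = ('tr??u++e' => 1, 'fa+l?se' => 0);

    # \Q...\E escapes text interpolated inside the SAME string literal,
    # so each key is quoted before the pieces are joined with '|'.
    sub quoted_alternation {
        my @keys = @_;
        return join '|', map { "\Q$_\E" } sort @keys;
    }

    my $match = quoted_alternation(keys %translate);
    print "$match\n";

    # The metacharacters now match literally.
    my $text = 'is it tr??u++e or fa+l?se';
    $text =~ s/($match)/$translate{$1}/g;
    print "$text\n";
    ```

    Writing map { quotemeta } instead of "\Q$_\E" should give the same pattern, since \Q is the interpolation-time form of quotemeta.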

    Update:

    There's another problem. The unordered keys of a hash are being used to build an ordered regex alternation, and an alternation always takes the first alternative that matches at a given position, even if a later alternative would match a longer string. The problem may not bite in this application because the strings being searched for (per the example in the OP) seem to be whole words bounded by non-word characters (and throwing in some \b assertions would probably help, as Your Mother points out in Re: timtowtdi but which is the [quote fingers]best[/quote fingers]), but in another case (and if the longest match is, indeed, desired) it might:

    >perl -wMstrict -le
    "my %xlate = (
       A    => '1',
       AA   => '2',
       AAA  => '3',
       AAAA => '4',
       );
     my $mooch = join '|', keys %xlate;
     $mooch = qr{ (?i) $mooch }xms;
     my $match = join '|', reverse sort keys %xlate;
     $match = qr{ (?i) $match }xms;
     print $mooch;
     print $match;
     ;;
     my $s = 'xAxAAxAAAxAAAAx';
     print qq{'$s'};
     $s =~ s{ ($mooch) }{$xlate{ uc $1 }}xmsg;
     print qq{'$s'};
     ;;
     $s = 'xAxaaxAaAxaAaAx';
     print qq{'$s'};
     $s =~ s{ ($match) }{$xlate{ uc $1 }}xmsg;
     print qq{'$s'};
    "
    (?msx-i: (?i) A|AA|AAAA|AAA )
    (?msx-i: (?i) AAAA|AAA|AA|A )
    'xAxAAxAAAxAAAAx'
    'x1x11x111x1111x'
    'xAxaaxAaAxaAaAx'
    'x1x2x3x4x'

    Since the OP seems to be looking for longest words and word phrases, I think I would go for the \b-elt-and-suspenders approach of something like:

    >perl -wMstrict -le
    "my %xlate = (
       'string a' => '3',
       'string b' => '2',
       true       => '1',
       false      => '0',
       tt         => '22',
       ttt        => '33',
       );
     my $xl = join '|', map { quotemeta } reverse sort keys %xlate;
     $xl = qr{ (?i) \b (?: $xl) \b }xms;
     ;;
     my $s = 'vv TTT ww TT String A xx sTrInG b xtruex True yy FALSE zz';
     print qq{'$s'};
     ;;
     $s =~ s{ ($xl) }{$xlate{ lc $1 }}xmsg;
     print qq{'$s'};
    "
    'vv TTT ww TT String A xx sTrInG b xtruex True yy FALSE zz'
    'vv 33 ww 22 3 xx 2 xtruex 1 yy 0 zz'
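
    For use outside a one-liner, the same idea might be wrapped up roughly as follows. This is a sketch, not a drop-in: make_translator is a hypothetical name, and it swaps the reverse sort above for an explicit longest-first sort, which is an assumption about what "longest match" should mean when keys are not prefixes of one another.

    ```perl
    use strict;
    use warnings;

    # Build a translating closure from a hash ref of phrase => replacement.
    # make_translator is a hypothetical helper, not something from the thread.
    sub make_translator {
        my ($table) = @_;

        # Lower-case the keys once so that lookups can use lc $1.
        my %xlate = map { ( lc($_) => $table->{$_} ) } keys %$table;

        # Longest keys first, so 'ttt' is tried before 'tt'.
        my $alt = join '|',
            map  { quotemeta($_) }
            sort { length($b) <=> length($a) or $a cmp $b }
            keys %xlate;

        # \b on both sides assumes every key starts and ends with a word character.
        my $re = qr/\b($alt)\b/i;

        return sub {
            my ($text) = @_;
            $text =~ s/$re/$xlate{lc $1}/g;
            return $text;
        };
    }

    my $translate = make_translator({
        'string a' => 3,  'string b' => 2,
        true       => 1,  false      => 0,
        tt         => 22, ttt        => 33,
    });

    print $translate->('vv TTT ww TT String A xx sTrInG b xtruex True yy FALSE zz'), "\n";
    ```

    Building the regex once and returning a closure keeps the sort and the quoting out of the per-line loop.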