in reply to Re: timtowtdi but which is the [quote fingers]best[/quote fingers]
my $match = "\Q" . join ("\E|\Q", keys %translate) . "\E";
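There's a problem here. In interpolating strings, \Q and \E are quoting 'controls' (if that's the right term) that act only on whatever is interpolated between them in the same string. Standing alone in a string like "\Q", they softly and silently vanish away and nothing gets escaped. Except for raising a ruckus of warnings, using non-interpolating strings doesn't help matters, because the regex compiler itself does not recognize \Q and \E: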
>perl -wMstrict -le
"my %translate = ( 'tr??u++e' => 1, 'fa+l?se' => 0, );
 ;;
 my $match = qq{\Q} . join(qq{\E|\Q}, keys %translate) . qq{\E};
 print qq{'$match'};
 $match = qr{$match};
 print $match;
 ;;
 my $mooch = '\Q' . join('\E|\Q', keys %translate) . '\E';
 print qq{'$mooch'};
 $mooch = qr{$mooch};
 print $mooch;
 "
'tr??u++e|fa+l?se'
(?-xism:tr??u++e|fa+l?se)
'\Qtr??u++e\E|\Qfa+l?se\E'
Unrecognized escape \Q passed through in regex...
Unrecognized escape \E passed through in regex...
Unrecognized escape \Q passed through in regex...
Unrecognized escape \E passed through in regex...
(?-xism:\Qtr??u++e\E|\Qfa+l?se\E)
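For comparison, here is a minimal sketch of one way to make the escaping actually happen: put \Q and \E in the same interpolating string as the variable they are meant to quote. The hash is the one from the example above; sorting the keys is only my assumption, added so the printed pattern is repeatable from run to run.

```perl
use strict;
use warnings;

my %translate = ( 'tr??u++e' => 1, 'fa+l?se' => 0 );

# \Q...\E only affects what is interpolated between them in the
# same string, so each key is quoted individually before joining.
my $match = join '|', map { "\Q$_\E" } sort keys %translate;
print "'${match}'\n";    # prints 'fa\+l\?se|tr\?\?u\+\+e'

# The compiled pattern now matches the keys as literal text.
my $rx = qr{$match};
print "matched a literal key\n" if 'xx fa+l?se yy' =~ $rx;
```

The map { "\Q$_\E" } form is equivalent to map { quotemeta }; which one to use is a matter of taste.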
Update:
There's another problem. The unordered keys of a hash are being used to build an ordered regex alternation, and an alternation always takes the first alternative that matches at a given position, even if a later alternative would match a longer string. The problem may not bite in this application because the strings being searched for (per the example in the OP) seem to be whole words delimited by non-word characters (and throwing in some \b assertions would probably help, as Your Mother points out in Re: timtowtdi but which is the [quote fingers]best[/quote fingers]), but in another case (and if the longest match is, indeed, desired) it might:
>perl -wMstrict -le
"my %xlate = ( A => '1', AA => '2', AAA => '3', AAAA => '4', );
 my $mooch = join '|', keys %xlate;
 $mooch = qr{ (?i) $mooch }xms;
 my $match = join '|', reverse sort keys %xlate;
 $match = qr{ (?i) $match }xms;
 print $mooch;
 print $match;
 ;;
 my $s = 'xAxAAxAAAxAAAAx';
 print qq{'$s'};
 $s =~ s{ ($mooch) }{$xlate{ uc $1 }}xmsg;
 print qq{'$s'};
 ;;
 $s = 'xAxaaxAaAxaAaAx';
 print qq{'$s'};
 $s =~ s{ ($match) }{$xlate{ uc $1 }}xmsg;
 print qq{'$s'};
 "
(?msx-i: (?i) A|AA|AAAA|AAA )
(?msx-i: (?i) AAAA|AAA|AA|A )
'xAxAAxAAAxAAAAx'
'x1x11x111x1111x'
'xAxaaxAaAxaAaAx'
'x1x2x3x4x'
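The reverse sort above works because a string always sorts after any of its own prefixes, so the longer alternative lands first. Sorting by length says the same thing more directly. A sketch under that assumption, using the same %xlate; the alphabetical tie-break and the names $alt, $rx and $t are my additions for illustration.

```perl
use strict;
use warnings;

my %xlate = ( A => '1', AA => '2', AAA => '3', AAAA => '4' );

# Longest keys first, so a longer alternative is always tried before
# any shorter one; ties are broken alphabetically for a stable order.
my $alt = join '|',
    map  { quotemeta }
    sort { length($b) <=> length($a) or $a cmp $b }
    keys %xlate;
my $rx = qr{ (?i) (?: $alt ) }xms;

my $s = 'xAxaaxAaAxaAaAx';
(my $t = $s) =~ s{ ($rx) }{$xlate{ uc $1 }}xmsg;
print "'${s}' -> '${t}'\n";    # 'xAxaaxAaAxaAaAx' -> 'x1x2x3x4x'
```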
Since the OP seems to be looking for the longest words and word phrases, I think I would go for the \b-elt-and-suspenders approach of something like:
>perl -wMstrict -le
"my %xlate = (
   'string a' => '3', 'string b' => '2',
   true => '1', false => '0',
   tt => '22', ttt => '33',
   );
 my $xl = join '|', map { quotemeta } reverse sort keys %xlate;
 $xl = qr{ (?i) \b (?: $xl) \b }xms;
 ;;
 my $s = 'vv TTT ww TT String A xx sTrInG b xtruex True yy FALSE zz';
 print qq{'$s'};
 ;;
 $s =~ s{ ($xl) }{$xlate{ lc $1 }}xmsg;
 print qq{'$s'};
 "
'vv TTT ww TT String A xx sTrInG b xtruex True yy FALSE zz'
'vv 33 ww 22 3 xx 2 xtruex 1 yy 0 zz'