chunlou has asked for the wisdom of the Perl Monks concerning the following question:

Came across the RegexpHash module one day. So, to try things out, I used it to translate words, from profanity to nicer ones.
use Tie::RegexpHash;

my $text = <<TEXT;
God damn you for your birth, you mummaphvker. I'm so PI\$\$\$\$ed, I hope you go to Hell. You have ruined my life. \$KRU you. I H\@TE you forever, you goddamn fvker
TEXT

print "Before:\n$text\n";

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

my %profane;
tie %profane, 'Tie::RegexpHash';

# profanity replacements
$profane{qr/ruin/i} = 'lighten';                    # ruin
$profane{qr/hell/i} = 'Heaven';                     # hell
$profane{qr/(damn|darn)/i} = 'bless';               # damn
$profane{qr/h(a|@)te/i} = 'love';                   # hate
$profane{qr/(f|ph)(u|v)(c*)(k*)er/i} = 'lover';     # f..ker
$profane{qr/pi(s+|\$+)/i} = 'touch';                # piss
$profane{qr/(s|\$)(c|k)r(ew|oo|u)/i} = 'Kiss';      # screw

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

my $profanity = join("|", keys(%profane));
$text =~ s/($profanity)/$profane{$1}/g;

print "After:\n$text\n";

__END__
Before:
God damn you for your birth, you mummaphvker. I'm so PI$$$$ed, I hope you go to Hell. You have ruined my life. $KRU you. I H@TE you forever, you goddamn fvker

After:
God bless you for your birth, you mummalover. I'm so touched, I hope you go to Heaven. You have lightened my life. Kiss you. I love you forever, you godbless lover
The profanity you match could be a regex pattern, but the word you replace it with can only be some predefined static string. So, if I want to replace "piss me off" or "piss us off" with "lift me up" or "lift us up", I would need to create two key/value pairs in RegexpHash, instead of something like /piss (.*) off/ and /lift $1 up/. Any short neat way to accomplish that?

More generally, is there any module or product that helps with translation of that sort (which is basically a special case of translating between languages, no)?

Thanks.
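A minimal sketch of one possible approach to the question above, not taken from the replies: instead of joining every pattern into one big alternation, keep a list of rules and apply each pattern on its own, handing its captures to a code ref that builds the replacement. The `@rules` list and the `soften` function are hypothetical names used only for illustration, and the sketch uses plain Perl rather than Tie::RegexpHash.

```perl
use strict;
use warnings;

# Hypothetical rule list: each rule pairs a pattern with a code ref
# that builds the replacement from that pattern's first capture.
my @rules = (
    [ qr/piss (me|us|you) off/i => sub { "lift $_[0] up" } ],
    [ qr/h(?:a|\@)te/i          => sub { 'love' } ],
);

sub soften {
    my ($text) = @_;
    for my $rule (@rules) {
        my ($pattern, $build) = @$rule;
        # Each pattern is applied separately, so $1 always refers to
        # the first group of the pattern currently being tried.
        $text =~ s/$pattern/$build->($1)/ge;
    }
    return $text;
}

print soften("You piss me off. I H\@TE you."), "\n";
```

The trade-off is one pass over the text per rule instead of a single pass, and the rules are applied in list order, so an earlier replacement can be matched again by a later rule.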

Re: Turn Hate Mail to Love Letter: Regex Multi-Word/Phase Replace
by Skeeve (Parson) on Jun 25, 2003 at 08:18 UTC
    how about:
    :
    :
    $profane{qr/piss (us|you) off/i} = 'lift $1 off';
    :
    :
    :
    :
    $text =~ s/($profanity)/eval '"'.$profane{$1}.'"'/ge;
    :
    :
    as a first try? Won't work with quotes in the replacement, but that's an easy one ;-)
      Good trick. But when s/($profanity)/eval '"'.$profane{$1}.'"'/ge turns into s/($profanity)/lift $1 off/g, doesn't it give you "lift piss you off off", since "piss you off" is what gets captured into $1?

      Consider this:
      $_ = "candies lift me up";
      s/(cand(y|ies)|lift(s|) (me|us) up)/1 $1; 2 $2; 3 $3; 4 $4;\n/g;
      print ;
      __END__
      1 candies; 2 ies; 3 ; 4 ;
      1 lift me up; 2 ; 3 ; 4 me;
      "me" is captured in $4. In other words, we don't know ahead of time which of $1 .. $n "me" will end up in, since the regex can have arbitrarily many brackets.
        Good point regarding my example.

        I should have used $2 instead. But I don't think this is a problem. You just have to remember to add 1 to your $n.

        Otherwise you could go the $&-way:

        s/$profanity/eval '"'.$profane{$&}.'"'/ge;
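Another route, sketched here as an assumption rather than something proposed in this thread: on Perl 5.10 or later (which postdates this discussion), a named capture can be read from `%+` regardless of how many numbered groups precede it in the combined pattern. The pattern, the group name `who`, the `uplift` function and the 'sweets' replacement are all hypothetical and exist only to illustrate the idea.

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+ need Perl 5.10 or later

# Hypothetical combined pattern: the named group "who" does not depend
# on the position of the numbered groups that come before it.
my $combined = qr/(cand(y|ies)|piss (?<who>me|us|you) off)/i;

sub uplift {
    my ($text) = @_;
    # $+{who} is defined only when the "piss ... off" branch matched;
    # otherwise the candy branch matched and gets a fixed replacement.
    $text =~ s/$combined/defined $+{who} ? 'lift ' . $+{who} . ' up' : 'sweets'/ge;
    return $text;
}

print uplift("Candies piss me off"), "\n";
```

This still needs some way to decide which replacement belongs to which branch; the sketch does it by checking whether the named group took part in the match.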