I came across the Tie::RegexpHash module one day. To try it out, I used it to translate words, replacing profanity with nicer ones.
use strict;
use warnings;
use Tie::RegexpHash;
my $text = <<TEXT;
God damn you for your birth, you mummaphvker.
I'm so PI\$\$\$\$ed, I hope you go to Hell.
You have ruined my life. \$KRU you.
I H\@TE you forever, you goddamn fvker
TEXT
print "Before:\n$text\n";
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
my %profane;
tie %profane, 'Tie::RegexpHash';
# profanity replacements
$profane{qr/ruin/i} = 'lighten'; # ruin
$profane{qr/hell/i} = 'Heaven'; # hell
$profane{qr/(damn|darn)/i} = 'bless'; # damn
$profane{qr/h(a|@)te/i} = 'love'; # hate
$profane{qr/(f|ph)(u|v)(c*)(k*)er/i} = 'lover'; # f..ker
$profane{qr/pi(s+|\$+)/i} = 'touch'; # piss
$profane{qr/(s|\$)(c|k)r(ew|oo|u)/i} = 'Kiss'; # screw
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
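# Build one big alternation out of all the patterns stored as keys.
my $profanity = join("|", keys(%profane));
# Tie::RegexpHash looks up the matched text ($1) against those patterns.
$text =~ s/($profanity)/$profane{$1}/g;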
print "After:\n$text\n";
__END__
Before:
God damn you for your birth, you mummaphvker.
I'm so PI$$$$ed, I hope you go to Hell.
You have ruined my life. $KRU you.
I H@TE you forever, you goddamn fvker
After:
God bless you for your birth, you mummalover.
I'm so touched, I hope you go to Heaven.
You have lightened my life. Kiss you.
I love you forever, you godbless lover
The profanity to match can be a regex pattern, but the replacement can only be a predefined static string. So if I want to replace "piss me off" or "piss us off" with "lift me up" or "lift us up", I need to create two key/value pairs in the RegexpHash, instead of a single pair such as /piss (.*) off/ mapped to /lift $1 up/. Is there a short, neat way to accomplish that?
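To make the idea concrete, here is a rough sketch of the behaviour I am after, using a plain list of pattern/code-ref pairs rather than Tie::RegexpHash. The names @rules and translate are made up for this example, and (\w+) stands in for (.*) so the match does not run too far. Treat it as an illustration under those assumptions, not a finished solution.

```perl
use strict;
use warnings;

# Hypothetical rule table (not part of Tie::RegexpHash): each entry pairs
# a pattern with a code ref that builds the replacement from the captures.
my @rules = (
    [ qr/piss (\w+) off/i, sub { "lift $_[0] up" } ],   # uses capture 1
    [ qr/h[a\@]te/i,       sub { 'love' } ],            # static replacement
);

# Apply every rule in order; the first two capture groups (if any) are
# handed to the code ref, which returns the replacement text.
sub translate {
    my ($text) = @_;
    for my $rule (@rules) {
        my ($re, $build) = @$rule;
        $text =~ s/$re/$build->($1, $2)/ge;
    }
    return $text;
}

print translate("You piss me off and piss us off. I H\@TE it."), "\n";
```

With this layout one rule covers both "piss me off" and "piss us off", at the cost of giving up the tied-hash lookup.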
More generally, is there any module or product that helps with translation of this sort (which is basically a special case of translating between languages, no)?
Thanks.