After reading about grep's efforts at Yoda-speak, I was inspired to try a translator for another CGI character with distinctive speech patterns. The result isn't so much a program as a regular-expression grinder that breaks rather easily. It works best on simple present-tense, first-person statements, which are mostly what Gollum mutters anyway.

I welcome any comments or suggestions. I would like nothing better than for someone to improve on this and share it back. Some possible enhancements: using a module to parse the grammar; modifying individual words one at a time so the global substitutions don't trip over each other (the order of the regular expressions is very deliberate); adding more Gollum-words to the lists for the randomizer to choose from. With something this primitive there are all sorts of possibilities.

Some examples to put through it:
I hate you forever
Do you like green eggs and ham
I want it back
Frodo is nice, not like Sam

#!/usr/bin/perl -w

use strict;

while (<>) {
  chomp;

  #call Sam 'the ____ hobbit'
  my @sam_list = qw(nasty rude mean);
  my $sam_descr = $sam_list[int rand scalar(@sam_list)];
  s/\bSam\b/the $sam_descr hobbit/ig;

  #replace 's' at word's end with 'ses'
  s/([a-rt-z]{3,})s(\b)/$1ses$2/ig;

  #replace 's' at word's start or middle with 'sss'
  s/(\b\w*)[sS]([a-df-z]+\b)/$1sss$2/ig;

  #replace 'me' with 'us'
  s/\bme\b/us/ig;

  #call Frodo 'Master'
  s/\bFrodo\b/Master/ig;

  #replace second-person words with third-person 'it'
  s/your/its/ig;
  s/\byou\b/it/ig;
  s/\bare\b/is/ig;
  s/have/has/ig;

  #replace past-tense words with present-tense
  s/was/is/ig;
  s/(\b\w{3,})ed(\b)/$1s$2/ig;

  #replace 'i <verb>' with 'i <verb>s' naively
  s/^i\s(\w{4,})\b/I $1s/ig;
  s/\bdo\b/does/ig;

  #replace first-person words with third-person
  my @i_list = qw(we smeagol);
  my $i_descr = $i_list[int rand scalar(@i_list)];
  s/\bi\b/$i_descr/ig;

  #stick a generic ending on the end ('no' for negatives)
  my $ending = "";
  $ending = "no" if (m/\b(?:not|no)\b/i);
  my @endings = ("", "yess", "my precioussss", "gollum");
  $ending = $endings[int rand scalar(@endings)] unless ($ending eq "no");
  $_ .= " $ending" if length $ending;

  #capitalize the sentence
  $_ = ucfirst($_);

  print $_, "\n";
}
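
One of the enhancements mentioned above, modifying individual words one at a time, might look roughly like the sketch below. It is only an illustration and not a drop-in replacement: the names %word_map, gollum_word and gollum_line are made up for this example, the table covers just a handful of the substitutions from the script, and "I" always becomes "we" instead of being picked at random so the output stays predictable. The point is that each input word is rewritten exactly once, so the output of one rule is never rescanned by a later one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical lookup table; the entries mirror some of the
# whole-word substitutions in the script above.
my %word_map = (
    frodo => 'Master',
    you   => 'it',
    your  => 'its',
    are   => 'is',
    have  => 'has',
    was   => 'is',
    me    => 'us',
    i     => 'we',
    do    => 'does',
);

# Translate a single word: exact matches first, then the 's' rules.
sub gollum_word {
    my ($word) = @_;
    my $key = lc $word;
    return $word_map{$key} if exists $word_map{$key};

    # 's' at the word's end becomes 'ses'
    return $word if $word =~ s/^([a-rt-z]{3,})s$/$1ses/i;

    # 's' at the word's start or middle becomes 'sss'
    $word =~ s/^(\w*)s([a-df-z]+)$/$1sss$2/i;
    return $word;
}

# Translate a line by rewriting each word exactly once.
sub gollum_line {
    my ($line) = @_;
    $line =~ s/(\w+)/gollum_word($1)/ge;
    return ucfirst $line;
}

print gollum_line($_), "\n"
    for ('I hate you forever', 'Frodo likes eggs');
# prints:
# We hate it forever
# Master likeses eggses
```

Because the substitution runs once per word, the ordering of the rules matters much less than in the global version, at the cost of losing the multi-word patterns such as 'i <verb>'.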

20030524 Edit by Corion: Changed absolute link into [id://...] style.

In reply to gollum speak translator by Anonymous Monk
