in reply to Re: Re: (nrd) Cool Uses For Perl? AIM-BOTS!
in thread Cool Uses For Perl? AIM-BOTS!

I was out at a birthday party for a friend of my finacee this evening, so I didn't have much time to work on this.

If you take this code (See code below) and add it to your bot - call the "grabMarkovPhrase" and it will return a string for you to return to the user talking to your bot. That way instead of just random text, it will return text that is more in the style of... well, whatever text that you train it on.
(you could log all of the responses that users give your bots into a file, then you could train this bot against that file - that way it would talk back to them in a manner similar to the way they talk to it... that would likely degrade since there will be a lot of mimicry that it will learn on)
The variables at the top control the minimum and maximum size of the string that it will return (counting "words" - really it just splits on a space and then looks to see how many array spots that creates).
I just real quickly tore this code out of my radiohead song generator stuff and also created just a flat file of all of the radiohead text.

There are a ton of things that could make this far better:

Some examples of running this a few times:
  1. White-washed me alive we've gotta dig it up on the main
  2. Immerse I'd telling to shreds. You find another mess
  3. Gave me a minute the underground. Let down Why don't really since in a jack the best you ca
  4. Bill My uncle Bill to try Oh, it's OK And why should turn animals Living in the beach wi
  5. And he's see me a message I can't afford to
  6. Karma police arrest when I got something My
  7. A Kid A Kinetic Kinetic Kinetic Kinetic Kinetic Kinetic Kinetic... oh no, pop it died a bag
  8. IA, the world astral carefully Burning in a prises, I wish I could be and it hard, t

(See code below)
#!/usr/bin/perl -w my $MAX_STRING_LENGTH = 20;#what is the max number of words you want t +o return my $MIN_STRING_LENGTH = 3;#what is the min number of words you want to + return #seems to look "best" (still not terriffic) when the max is 20 and the + min is 3 #yee ol global hash my %dataHash; ########################### #pass in file name and then put its text in a string - stripping out a +ll of the newlines #looks at 'words' of size 4 at a time ########################### sub createQuadHash{ my $fileName = shift(@_); my $allText = ""; open(PHRASES, $fileName) or die "Can't open that file: $!\n"; while(<PHRASES>){ $allText .= $_; } close(PHRASES); $allText =~ tr/\n/ /d;#replace the newlines with spaces (just chom +ping sometimes results in words mashed together) $allText =~ tr/[]()\"//d;#strip out chars that we don't want (beyo +nd just newlines) #now take $allText and split it into an array @allTextArr = split('', $allText); #iterate the array and fill a hash, looking at the $_ and the one +to the right for(my $x = 0; $x < $#allTextArr; $x++){ #make sure that the $pos2 isn't gonna be out of the array + if($x+7 <= $#allTextArr){ $pos1 = $allTextArr[$x].$allTextArr[$x+1].$allTextArr[$x+2 +].$allTextArr[$x+3]; $pos2 = $allTextArr[$x+4].$allTextArr[$x+5].$allTextArr[$x ++6].$allTextArr[$x+7]; $dataHash{$pos1}{$pos2}++; } } return; } ########################### ########################### #from Perl Cookbook 2.10 #take in a hash - convert the values to percentages, and then return t +hat hash ########################### sub weight_to_dist { my %weights = @_; my %dist = (); my $total = 0; my ($key, $weight); local $_; foreach (values %weights) { $total += $_; } while ( ($key, $weight) = each %weights ) { $dist{$key} = $weight/$total; } return %dist; } ########################### ########################### #from Perl Cookbook 2.10 #take in a hash and via a weighted random, pick one of the keys based +on the value and return it ########################### sub weighted_rand { my %dist = @_; my ($key, $weight); my $rand; while (1) { # to avoid floating point inaccura +cies $rand = rand; while ( ($key, $weight) = each %dist ) { return $key if ($rand -= $weight) < 0; } } } ########################### ########################### #randomly pick and return a starting letter ########################### sub startLetter{ my $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'; my $startLetter = substr($alphabet, int(rand(26)), 1); return $startLetter; } ########################### ########################### #randomly pick a key from the songsHash ########################### sub randomKeyFromFirstHash{ return ((keys %dataHash)[int rand keys %dataHash]); } ########################### ########################### #this returns the phrase to show to the screen/user ########################### sub grabMarkovPhrase { createQuadHash('radiohead.txt'); my $s = ""; my $start = ""; my $wordCount = 0; my %tempHash; my @tempArr; my $retStr = ""; #randomly pick a phrase length - between 1 and the MAX_STRING_LENG +TH my $stoppinTime = int((rand($MAX_STRING_LENGTH-$MIN_STRING_LENGTH) + + $MIN_STRING_LENGTH)); #randomly pick a spot in the hash, then check to see if the first +letter is a cap letter #if it is, then carry on, if not, pick again my $fail = 0; while($fail != 1){ $start = randomKeyFromFirstHash();#will be three chars #if the first position is a capital letter, then cool, (ord wi +ll only look at the first) if(ord($start) > 64 && ord($start) < 91){ $fail = 1; } } #take the start letter and get rolling... $s = $start; $retStr .= $s; #count the letters followed by spaces while($wordCount < $stoppinTime){ %tempHash = weight_to_dist(%{$dataHash{$s}}); $s = weighted_rand(%tempHash); $retStr .= $s; #count the number of words, split on the space @tempArr = split(' ',$retStr); $wordCount = scalar(@tempArr); } return $retStr; } ########################### print grabMarkovPhrase();