Following talk in the CB today, I started poking through the latest XML ticker, which shows the list of all monks at the monastery. I was struck by how many of the usernames were similar to real words. Combining that with my wish to play with Text::Soundex gave me..
The wonderful toy of Monastery::Monkify. It changes every word into a link to the home node of a monk with a similar name. Unfortunately, Text::Soundex isn't picky enough to get really good matches, but it's still fun. Anyways, the code:
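use strict;
use Text::Soundex;
use LWP::Simple;
use HTML::Entities;

# Words with no soundex code get this marker instead of undef.
$Text::Soundex::nocode = 'Z000';
$| = 1;

print "Loading..";
# One entry per ticker line: [name, soundex letter, soundex digits].
my @nodes = map {
    ($_) = /"([^"]+)"/;
    defined($_) ? [$_, soundex($_) =~ /(.)(.{3})/] : ['', 'Z', 0]
} split /\n/, get 'http://perlmonks.org/?node_id=74291';
print "Done.\n";

# Turn every word of the input into a link to a similar-sounding monk.
s/(\w+)/bestfit($1)/ge, print while <>;

sub bestfit {
    my ($word) = @_;
    my @se = soundex($word) =~ /(.)(.{3})/;
    return "[nodereaper|$word]" if join('', @se) eq 'Z000';

    # Same first letter, sorted by distance between the numeric parts.
    # (abs() is needed on both sides of the comparison.)
    my @found = sort { abs($se[1] - $a->[2]) <=> abs($se[1] - $b->[2]) }
                grep { $se[0] eq $_->[1] } @nodes;
    return $word unless @found;    # no monk shares the first letter

    # Pick at random among the monks that share the best match's code.
    my @best = grep { $found[0][2] == $_->[2] } @found;
    return "[" . decode_entities($best[rand @best][0]) . "|$word]";
}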
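For anyone who wants to see why the matches come out so loose, here is a rough, self-contained sketch of the idea. It does not use Text::Soundex: `simple_soundex` is my own simplified stand-in (it treats H and W like vowels, which the real algorithm does not), `closest` is a hypothetical helper, and the names in `@monks` are made up for illustration rather than taken from the ticker.

```perl
use strict;
use warnings;

# Simplified Soundex: first letter plus three digits.
# Assumption: H and W are treated like vowels, unlike the full algorithm.
sub simple_soundex {
    my ($word) = @_;
    my $s = uc $word;
    $s =~ tr/A-Z//cd;                  # keep letters only
    return 'Z000' unless length $s;    # same "no code" marker as the post
    my $first = substr $s, 0, 1;
    (my $digits = $s) =~
        tr/AEIOUYHWBFPVCGJKQSXZDTLMNR/00000000111122222222334556/;
    $digits =~ tr/0-6//s;              # squeeze runs of the same digit
    $digits = substr $digits, 1;       # the first letter stays a letter
    $digits =~ tr/0//d;                # vowels carry no code
    return $first . substr($digits . '000', 0, 3);
}

# Made-up names standing in for the downloaded monk list.
# Each entry is [name, soundex letter, soundex digits].
my @monks = map { [ $_, simple_soundex($_) =~ /(.)(.{3})/ ] }
    qw(merlin mirror tulip teapot);

# Return the name whose code has the same letter and the nearest digits.
sub closest {
    my ($word) = @_;
    my ($letter, $num) = simple_soundex($word) =~ /(.)(.{3})/;
    my @same = sort { abs($num - $a->[2]) <=> abs($num - $b->[2]) }
               grep { $_->[1] eq $letter } @monks;
    return @same ? $same[0][0] : undef;
}

print "$_ => ", simple_soundex($_), "\n" for qw(Robert Rupert);
print "marble => ", closest('marble'), "\n";
```

"Robert" and "Rupert" collapse to the same code, and "marble" lands on "merlin" only because their digits happen to be numerically close. Treating the digits as a number is a crude distance at best, which is a big part of why the matches aren't very picky.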
Note: I'm doing a somewhat evil thing here, by processing XML using regexes. I'm forced to do this, however, because the "XML" node in question isn't actually XML -- it has non-XML entities. Ah, well..
perl -e 'print "I love $^X$\"$]!$/"#$&V"+@( NO CARRIER'
Re: Monastery::Monkify
by Albannach (Monsignor) on Apr 21, 2001 at 21:07 UTC
(ar0n) Re: Monastery::Monkify
by ar0n (Priest) on Apr 21, 2001 at 21:25 UTC