In my oreillynet blog today, I call for people to stop working on content-based spam filtering, since it's an unwinnable arms race. As an example, I got a spam today that led off with:
., ,; .R, @FS fUD jos DN Gw, Fzw OUn hdx DLdknFf: qgOKPugU aYkIda @ygoaQr Dj hN Sam xb tJ. mBT. fSV zek Nw; @Hf dxd Stk ALQ TZFwKw: qR ol HJb EmpiiA@ sb .Vz XWw chY:: Aw, ju iA GFk aHs,c woi FsrQua Gcc pW kA IBy HFd ZVx Gsx SME ziyA riA UNvhcHbgj NZaBdunU TYA NsaQfMzrRB , ,:;U : Ae , ,;w .: lze yrP IegDp.
Since that's filled with obvious non-words, I demonstrated how easy it would be for a spammer to replace each one with a real word or name of the same length:
use strict;
use warnings;

# Bucket the names by length: $words{3} holds every three-letter name, etc.
my %words;
open( my $fh, '<', '/usr/share/dict/propernames' ) or die $!;
while (<$fh>) {
    chomp;
    push( @{$words{length($_)}}, $_ );
}
close $fh;

# Swap every whitespace-delimited token for a random name of the same length.
while (<DATA>) {
    s/(\S+)/replace($1)/ge;
    print;
}

sub replace {
    # No name of this length? Leave the token untouched.
    my $list = $words{length $_[0]} or return $_[0];
    return $list->[rand @$list];
}

__DATA__
., ,; .r, @ln qly tlg nq aq, Brg iaB WiW iqpbduk: ifcciWvj Wypdip @rnoqqS lc st unx mm su. Wyl. eee daa jb; @kS kjt smp WkW 8hytct: ih xd WiZ Zlantc@ tg .vk WrW cyW:: hy, vx bo WnW gtx,i 0rW SnjsaS WbW gw oo kkZ rto WeW fvB 0qZ xbcd ocg tfrotxynk veqWhurb kdy wavkuseax0 , ,:;i : yr , ,;i .: Zjc ugr btfau.
which, when run, gives something like this (the picks are random, so every run differs):
Ti Po Kaj Tao Wes Art Al Ian Jem Tao Raj Caroline Jeanette Harold Bradley Al Ji Stu Ro Hon Axel Kaj Tim Sam Stu Lee Tad Raj Phiroze Ed Ro Lin Shankar Hy Lex Ric Barry Van No Ji Jim Jerry Ram Sorrel Luc Ji Ji Kaj Van Mah Fay Art Hohn Ami Krzysztof Jennifer Jan Novorolsky , Saul : Ed , Per Ro Rob Bob Amedeo
Of course, you can use any list of words you like: /usr/share/dict/propernames just gives nicer results than /usr/share/dict/words did for this example.
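If you want to try other lists, here is a rough sketch of the same idea with the list passed in rather than hard-wired. It is not what I ran for the blog post: the subs index_by_length and disguise are names I made up for illustration, the eight-name list is a stand-in for whatever file you would load, and unlike the script above it only touches runs of ASCII letters, so the punctuation survives.

```perl
use strict;
use warnings;

# Hypothetical helper: bucket any list of words by length.
sub index_by_length {
    my @list = @_;
    my %by_length;
    push @{ $by_length{ length $_ } }, $_ for @list;
    return \%by_length;
}

# Hypothetical helper: swap each run of letters for a random word of the
# same length. Punctuation, digits, and runs with no same-length
# candidate are left as they were.
sub disguise {
    my ( $text, $by_length ) = @_;
    $text =~ s{([A-Za-z]+)}{
        my $candidates = $by_length->{ length $1 };
        $candidates ? $candidates->[ rand @$candidates ] : $1;
    }ge;
    return $text;
}

# Made-up stand-in list; in practice, read one word per line from a file.
my $names = index_by_length(qw( Al Ed Ian Sam Stu Axel Barry Harold ));
print disguise( "., ,; .R, \@FS fUD jos DN Gw, Fzw OUn hdx\n", $names );
```

Keeping the punctuation in place makes the result look even more like ordinary text, which only strengthens the point about content-based filtering.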

xoxo,
Andy


In reply to A handy use for /usr/share/dict/words by petdance
