The only workable approach I know of for filtering profanity and vulgarity out of content is a pair of human eyes.
Beware of trying to fully automate a solution to this problem. Humans are simply too creative, and human language is simply too ambiguous.
I think your best bet is to have your script flag what it considers the worst potential offenders, then have a human read those transcripts and take action when necessary.
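To make the flag-then-review idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the watch list, its weights, the threshold, and the function names are made up, and a real list would be specific to your site. The point is that the script only ranks posts for a person to read; it never deletes or rewrites anything itself.

```python
import re
from dataclasses import dataclass

# Hypothetical watch list with made-up weights (not from any real filter).
WATCH_LIST = {"damn": 1, "hell": 1, "crap": 2}


@dataclass
class Flagged:
    post_id: int
    text: str
    score: int
    hits: list


def score_post(text):
    """Return (score, hits) using whole-word matches against WATCH_LIST."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [w for w in words if w in WATCH_LIST]
    return sum(WATCH_LIST[w] for w in hits), hits


def build_review_queue(posts, threshold=2, limit=10):
    """Rank posts by score and return the worst ones for a human to read.

    Nothing is removed or altered here; the reviewer makes the decision.
    """
    flagged = []
    for post_id, text in posts.items():
        score, hits = score_post(text)
        if score >= threshold:
            flagged.append(Flagged(post_id, text, score, hits))
    flagged.sort(key=lambda f: f.score, reverse=True)
    return flagged[:limit]


if __name__ == "__main__":
    sample_posts = {
        1: "Go Hellcats!",
        2: "What the hell, that ref was crap",
        3: "Nice game everyone",
    }
    for item in build_review_queue(sample_posts):
        print(item.post_id, item.score, item.hits)
```

Matching whole words rather than substrings keeps a team name like "Hellcats" out of the queue, but it will still miss creative spellings and wordplay, which is exactly why the last step is a person and not the script.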
I speak from experience: A dot-com I worked for tried to automatically filter user content on a site, and it monumentally backfired. There were so many embarrassing and problematic false positives that the filters were eventually removed.
(I should point out that the site wasn't even the kind you'd expect to generate such problems: it was a recreational sports site! As it turns out, a lot of people like to include double entendres or clever wordplay when naming their sports teams ;-)
Hope this helps!