in reply to Re: Re:(2) newkid confused
in thread newkid confused: reverse then unreverse a string

It would be nice to have a standard-issue CPAN module for checking so-called "cusswords", perhaps in the Lingua branch, as in, Lingua::Foul or what have you.

Rolling your own HTML parser or CGI handler is a waste of time, and the same goes for this sort of thing: it is done all the time, and probably done badly too. "Badly" not necessarily because of bad programming, but because the problem was not thought all the way through.

What I mean specifically is that a group of people, from different backgrounds, will certainly come up with a better module than a single person.

There are many words that are not foul, but are constantly picked up by filters that are too paranoid. As an example, consider "shitake mushrooms" or even "gratitude", which I have seen banned because it contains "tit". Heck, there are hockey players named Satan ("Sha-tan" presumably) and a town in Austria named Fucking ("Foo-king").
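To make the false-positive problem concrete, here is a minimal sketch comparing a naive substring match with a whole-word match. The two-entry word list and the names `flagged`, `$substring` and `$whole_word` are made up for illustration; a real module would need a far better list.

```perl
use strict;
use warnings;

# Hypothetical two-entry word list, taken from the examples above.
my @words = qw(tit shit);
my $alt   = join '|', map { quotemeta } @words;

my $substring  = qr/(?:$alt)/i;        # paranoid: matches anywhere in the text
my $whole_word = qr/\b(?:$alt)\b/i;    # matches only standalone words

# Returns 1 if the pattern matches the text, 0 otherwise.
sub flagged {
    my ($re, $text) = @_;
    return $text =~ $re ? 1 : 0;
}

for my $text ('gratitude', 'shitake mushrooms') {
    printf "%-20s substring=%d whole_word=%d\n",
        $text, flagged($substring, $text), flagged($whole_word, $text);
}
```

Both inputs trip the substring pattern and pass the whole-word one. Word boundaries do nothing for proper nouns like the hockey player or the Austrian town, so they are only a first step.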

The chili-pepper system used in Eudora is interesting, because it does not block messages, but warns about their content based on triggers. A good module would have to do the same, perhaps giving an indication of how "spicy" the text is.

A robust detection system could then be extended to star-out, omit, replace, or otherwise handle any foul language, but ultimately at the discretion of the programmer and perhaps any user settings.
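A rough sketch of that split between detection and handling might look like the following. Everything in it is an assumption made for illustration: the word list, the weights, and the function names `spice_score` and `star_out` do not come from any existing module.

```perl
use strict;
use warnings;

# Hypothetical word list with made-up weights (1 = mild, higher = stronger).
my %weight = (darn => 1, heck => 1, bloody => 2);
my $alt    = join '|', map { quotemeta } sort keys %weight;
my $re     = qr/\b($alt)\b/i;

# Detection only: return the total weight of all whole-word hits.
sub spice_score {
    my ($text) = @_;
    my $score = 0;
    $score += $weight{ lc $1 } while $text =~ /$re/g;
    return $score;
}

# One possible handler: keep the first letter and star out the rest.
sub star_out {
    my ($text) = @_;
    $text =~ s/$re/substr($1, 0, 1) . '*' x (length($1) - 1)/ge;
    return $text;
}

my $msg = 'Heck, that bloody parser again';
print spice_score($msg), "\n";    # prints 3
print star_out($msg), "\n";       # prints H***, that b***** parser again
```

The caller decides what to do with the score: warn, star out, drop the message, or nothing at all. That keeps policy out of the detector.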

I'm sure a lot of people would want to use a module like this, but have no intention of writing one.

Re: Re^3: Cusswords
by belg4mit (Prior) on Apr 06, 2002 at 00:19 UTC
    TheDamian to the rescue: Regexp::Common. Look for "profanity". Of course, it's English-only.

    --
    perl -pe "s/\b;([st])/'\1/mg"