in reply to Re: regex for swear filter
in thread regex for swear filter

How are you going to discuss the comparative advantages and disadvantages of various beasts of burden with a filter like that?

Re: Re: Re: regex for swear filter
by halley (Prior) on Feb 13, 2004 at 12:46 UTC
    When forced by a prudish management to solve a similar problem, I assigned a point system. Each regex that applied would add or subtract points. Only those matches passing a point threshold would be scrubbed.

    For example, it's more likely to be an intentional curse if it's at the beginning or end of a word. It's more likely still if it is the whole word (word boundaries on both ends). It's less likely if it appears buried in a word; these are not filtered, much to the relief of residents of Scunthorpe.
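
    A minimal sketch of that kind of point system, in Python for illustration. The word list, the weights, and the threshold are all hypothetical placeholders; the post above gives no concrete values, so treat this as one possible reading of the idea rather than the filter that was actually deployed.

```python
import re

# Hypothetical placeholder words, weights, and threshold (not from the post).
WORDS = ["darn", "heck"]
THRESHOLD = 2


def score_matches(text, words=WORDS):
    """Return (start, end, score) for every occurrence of a listed word."""
    results = []
    for word in words:
        for m in re.finditer(re.escape(word), text, flags=re.IGNORECASE):
            start, end = m.span()
            # A boundary is the edge of the text or a non-alphanumeric neighbour.
            at_start = start == 0 or not text[start - 1].isalnum()
            at_end = end == len(text) or not text[end].isalnum()
            score = 0
            if at_start:
                score += 1  # begins a word
            if at_end:
                score += 1  # ends a word
            if at_start and at_end:
                score += 1  # is the whole word
            if not at_start and not at_end:
                score -= 2  # buried inside a longer word
            results.append((start, end, score))
    return results


def scrub(text, threshold=THRESHOLD):
    """Mask only the matches whose score reaches the threshold."""
    out = list(text)
    for start, end, score in score_matches(text):
        if score >= threshold:
            out[start:end] = "*" * (end - start)
    return "".join(out)
```

    With these made-up weights only whole-word matches reach the threshold; a match at just one edge of a word scores 1 and a buried match scores -2, so both are left alone. Lowering the threshold to 1 would also scrub prefix and suffix matches, at the cost of more false positives.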

    --
    [ e d @ h a l l e y . c c ]

      Whatever filter you make, it's easy to circumvent. Just witness all the "spam" and "nanny" filters that block emails or websites discussing breast cancer, or mentioning non-body parts like 'ass' and 'nipple', but allow texts mentioning 'V-I-A-G-R-A', 'H*T T!T$' or '\/\/3+ p|_|zz!35'.

      Swear filters are a technical solution to a social problem. Technical solutions to social problems usually don't work, and have bad side effects.

      Abigail

        I thoroughly agree. That doesn't change the fact that many managers *don't* agree, and writing code to change "foo" to "*$@" will sometimes pay the rent for you and your children. I'm glad I convinced my managers to let the *recipient* choose the filter settings, and not try to impose their sensitivities on everyone else.

        The scheme I wrote was pretty effective at finding creative alternative glyphic forms, like $.h.1.+. Some kids played with the boundaries of what the filter could and could not do, but the average everyday slips of decorum were found and scrubbed.
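
        One way such alternative forms could be caught is to normalize each token before matching: drop the separator characters and map look-alike glyphs back to letters. The sketch below is in Python for illustration, and the substitution table, the separator set, and the function names are assumptions of mine, not the scheme described above.

```python
import re

# Hypothetical look-alike table; a real one would be much larger.
LOOKALIKES = str.maketrans({
    "$": "s", "1": "i", "!": "i", "+": "t",
    "3": "e", "0": "o", "@": "a",
})

# Characters assumed to be used only as padding between letters.
SEPARATORS = re.compile(r"[.\-_]+")


def normalize(token):
    """Strip padding characters, then map look-alike glyphs to letters."""
    return SEPARATORS.sub("", token).translate(LOOKALIKES).lower()


def find_disguised(text, words):
    """Return the original tokens whose normalized form is a listed word."""
    return [tok for tok in text.split() if normalize(tok) in words]
```

        For example, `normalize("h.3.c.k")` gives `"heck"`. Splitting on whitespace means a word spelled with spaces between its letters slips through, which is the kind of boundary the kids mentioned above would find.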

        For the most part, it solved the problem put to management: that casual swearing was filtered if individual users wanted it to be filtered.

        --
        [ e d @ h a l l e y . c c ]