When forced by a prudish management to solve a similar problem, I built a point system. Each regex that applied would add or subtract points. Only those matches passing a point threshold would be scrubbed.
For example, a match is more likely to be an intentional curse if it's at the beginning or end of a word. It's also more likely if it is the whole word (word boundaries on both ends). It's less likely if it appears buried in a word; those matches are not filtered, much to the relief of the residents of Scunthorpe.
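To make the idea concrete, here is a minimal sketch of that kind of scoring in Python. It is not the code I shipped: the rule list, the point values, the threshold, and the names `RULES`, `score_matches` and `scrub` are all made up for illustration, and "darn" stands in for the real word list.

```python
import re
from collections import defaultdict

# Hypothetical rules: (regex template, points). {w} is replaced by the word.
RULES = [
    (r"\b{w}", 2),     # match starts a word
    (r"{w}\b", 2),     # match ends a word
    (r"\b{w}\b", 1),   # bonus: match is the whole word
    (r"\B{w}\B", -3),  # match is buried inside a longer word
]
THRESHOLD = 2  # assumed cutoff; tune to taste


def score_matches(text, word):
    """Return {(start, end): points} for every place `word` occurs in `text`."""
    scores = defaultdict(int)
    for template, points in RULES:
        rx = re.compile(template.format(w=re.escape(word)), re.IGNORECASE)
        for m in rx.finditer(text):
            # Every rule that applies to this span adds or subtracts points.
            scores[m.span()] += points
    return dict(scores)


def scrub(text, words, threshold=THRESHOLD):
    """Star out only the matches whose total score reaches the threshold."""
    for word in words:
        for (start, end), points in score_matches(text, word).items():
            if points >= threshold:
                # Same-length replacement, so the other spans stay valid.
                text = text[:start] + "*" * (end - start) + text[end:]
    return text


print(scrub("Darn it, the undarned socks", ["darn"]))
```

With these numbers a whole-word match scores 5, a match at either edge of a word scores 2, and a buried match scores -3, so only the buried one survives the filter.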
-- [ e d @ h a l l e y . c c ]
I thoroughly agree. That doesn't change the fact that many managers *don't* agree, and writing code to change "foo" to "*$@" will sometimes pay the rent for you and your children. I'm glad I convinced my managers to let the *recipient* choose the filter settings, and not try to impose their sensitivities on everyone else.
The scheme I wrote was pretty effective at finding creative alternative glyphic forms, like $.h.1.+. Some kids played with the boundaries of what the filter could and could not do, but the average everyday slips of decorum were found and scrubbed.
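One way to catch that sort of disguised spelling is to expand each letter of a listed word into a class of lookalike characters and allow a little filler punctuation between letters. The sketch below is an assumption about how that could be done, not a copy of my scheme: the `LOOKALIKES` table, the `SEPARATORS` pattern and the `variant_pattern` helper are hypothetical, and "darn" is again a stand-in word.

```python
import re

# Hypothetical lookalike table; a real one would be much larger.
LOOKALIKES = {
    "a": "a4@",
    "e": "e3",
    "i": "i1!|",
    "o": "o0",
    "s": "s5$",
    "t": "t7+",
}

# Assumed filler: up to two punctuation characters between letters.
SEPARATORS = r"[._\-*]{0,2}"


def variant_pattern(word):
    """Compile a regex matching `word` and its disguised spellings."""
    parts = []
    for ch in word.lower():
        chars = LOOKALIKES.get(ch, ch)
        # One character class per letter, e.g. "a" becomes [a4@].
        parts.append("[" + re.escape(chars) + "]")
    return re.compile(SEPARATORS.join(parts), re.IGNORECASE)


rx = variant_pattern("darn")
for sample in ["darn", "D@RN", "d.4.r.n", "dawn"]:
    print(sample, bool(rx.fullmatch(sample)))
```

A pattern this loose produces more false positives than a literal match, which is one more reason to score each hit before scrubbing it.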
For the most part, it solved the problem put to management: that casual swearing was filtered if individual users wanted it to be filtered.
-- [ e d @ h a l l e y . c c ]