in reply to Regex solution needed

I don't have a regex solution for you, just a couple of comments.
  1. Is there a reason you're not using the CPAN profanity modules?
  2. On an ESPN board (obviously sports-related conversations) I had the following phrase edited: "Well, you beat me this week." became "Well, you &*%$#%^ this week".

I think the edited version made it sound like I said something much worse than I did!

Re^2: Regex solution needed
by spivey3587 (Acolyte) on Feb 23, 2007 at 17:08 UTC
    I did test the CPAN profanity modules so as not to reinvent the wheel, but they produced a lot of false positives on the data I was testing against. I figured they might be trying to do too much, so I went with my own design.

    Funny about your posting, and obviously that's a problem with this sort of thing. My module only returns true/false, so it's up to the calling program to decide what to do. I've tested with a LOT of real thread postings and have slimmed the dictionary down so that it's lenient. Testing for 'beat' as a vulgar term is ridiculous, IMHO.
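
A minimal sketch of the true/false check described above, not the poster's actual module. The word list, the `has_profanity` name, and the use of `\b` word boundaries are all assumptions made for illustration; the placeholder terms stand in for a real dictionary.

```perl
use strict;
use warnings;

# Hypothetical, deliberately tiny dictionary; placeholders stand in for real terms.
my @dictionary = qw(darn heck);

# Build one alternation and anchor it on word boundaries, so a listed
# term embedded in a longer word (e.g. "heck" in "checkered") is ignored.
my $alternation  = join '|', map { quotemeta($_) } @dictionary;
my $profanity_re = qr/\b(?:$alternation)\b/i;

# Returns 1 if the text contains a listed word, 0 otherwise.
# What to do with the answer is left to the caller.
sub has_profanity {
    my ($text) = @_;
    return $text =~ $profanity_re ? 1 : 0;
}

print has_profanity("Well, you beat me this week."), "\n";
print has_profanity("Well, heck."), "\n";
```

Keeping the dictionary short and matching whole words only are the two knobs that trade missed hits for fewer false positives.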

Re^2: Regex solution needed
by ikegami (Patriarch) on Feb 23, 2007 at 20:08 UTC

    "Well, you beat me this week." it became "Well, you &*%$#%^ this week".

    Once possible profanities have been identified, censoring is just one option. Another would be to list the possible profanities to the user, warn him that profanities are not allowed on the board and allow him to proceed without further editing. This would allow the moderators to delete the post without further warning. On a board that supports moderation, the post could even be withheld until approved by the moderators.
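
The "list the possible profanities to the user" option could be sketched roughly as follows. The dictionary, the `flagged_words` name, and the warning text are hypothetical; how a real board would prompt the user or queue the post for moderators is not shown.

```perl
use strict;
use warnings;

# Hypothetical placeholder dictionary, as a stand-in for a real word list.
my @dictionary   = qw(darn heck);
my $alternation  = join '|', map { quotemeta($_) } @dictionary;
my $capture_re   = qr/\b($alternation)\b/i;

# Returns the distinct flagged words, lowercased, in order of first appearance.
# Nothing is censored; the caller decides whether to warn, hold, or reject.
sub flagged_words {
    my ($text) = @_;
    my %seen;
    return grep { !$seen{$_}++ } map { lc($_) } $text =~ /$capture_re/g;
}

my @hits = flagged_words("Darn it, heck, darn!");
if (@hits) {
    print "Possible profanities: @hits\n";
    print "Profanity is not allowed on this board. Post anyway?\n";
}
```

Because the original text is left untouched, a false positive costs the user one extra confirmation instead of a mangled post.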

      Once possible profanities have been identified, censoring is just one option.
      This is getting political / philosophical, but I tend to say that censoring is never an option!

      CountZero

      "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re^2: Regex solution needed
by dwhite20899 (Friar) on Feb 23, 2007 at 19:32 UTC