Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I want to create a word filter, and I want to store all the words in a database somewhere, probably just some text file. I'm smart enough to get the database from a file into an array, but what I want to do is then to use that array in my regular expressions...

Say I have an array of bad words. To keep it simple, start with a single string that contains one bad word:
$badword = "badword";
Now I want to test my string, $str, for any instances of $badword. But I don't want to leave it at that: I also want to catch the silly little "b*a*d w&or-d" instances. How would I check $str for $badword with any non-word characters between each character? Basically I want to do the following, but using a string rather than literals:
$str =~ /(\bb\W*a\W*d\W*w\W*o\W*r\W*d\b)|badword/;
How could I do that with an array of strings?

Replies are listed 'Best First'.
Re: using a string in a regexp
by frag (Hermit) on Jan 04, 2002 at 06:12 UTC
    Bowdlerization, huh?
    You can split each word into individual characters, rejoin them with \W*, and compile each result into a regex with qr//, which you then use in your s///.
    @badwords = map {
        my $regex = join '\W*', map { quotemeta($_) } split //, $_;
        qr/$regex/i;
    } @badwords;
    # /c has no effect on s///, so only /g is used
    foreach (@badwords) {
        $string =~ s/$_/<expletive>/g;
    }
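    As a rough, self-contained sketch of the same idea (not necessarily how you would want it in production), the word list can also be folded into one combined pattern so the text is scanned only once. The sample word list and the helper names spaced_pattern and censor are made up for illustration, and the \b anchors are borrowed from the hand-written regex in the question.

```perl
use strict;
use warnings;

# Hypothetical word list; in practice this would be read from the text file.
my @badwords = ('badword', 'darn');

# Turn one word into a pattern that allows any run of non-word
# characters between consecutive letters.
sub spaced_pattern {
    my ($word) = @_;
    return join '\W*', map { quotemeta($_) } split //, $word;
}

# Combine all words into a single case-insensitive alternation,
# anchored on word boundaries as in the original hand-written regex.
my $alternation = join '|', map { spaced_pattern($_) } @badwords;
my $filter      = qr/\b(?:$alternation)\b/i;

# Return a copy of the text with every match replaced.
sub censor {
    my ($text) = @_;
    (my $clean = $text) =~ s/$filter/<expletive>/g;
    return $clean;
}

print censor('That b*a*d w&or-d again, DARN it.'), "\n";
```

    Because of the \b anchors, a bad word embedded inside a longer word is left alone; drop them if you want those caught too.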

    -- Frag.
    --
    "Just remember what ol' Jack Burton does when the earth quakes, the poison arrows fall from the sky, and the pillars of Heaven shake. Yeah, Jack Burton just looks that big old storm right in the eye and says, "Give me your best shot. I can take it."

Re: using a string in a regexp
by gav^ (Curate) on Jan 04, 2002 at 08:10 UTC
    If you were looking to identify/remove profanity, Regexp::Common has a couple of cunning regexps. It will probably catch a bunch of stuff you didn't think of too.