in reply to RegEx to match at least one non-adjacent term
Normally, you'd be able to use \b.
    my $whitespace = qr/[\s()]+/;
    my $badwords   = qr/.../i;
    my $wordchar   = qr/[a-zA-Z]/;

    s/ $whitespace? \b $badwords \b $whitespace? / /xg;
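But since you also want to match the word inside something like "12345Red6789", where the digits count as word characters and \b therefore sees no boundary around "Red", you'll have to implement your own version of \b.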
    my $whitespace = qr/[\s()]+/;
    my $badwords   = qr/.../i;
    my $wordchar   = qr/[a-zA-Z]/;

    s/
       $whitespace?
       (?<! $wordchar )  # At start of word.
       $badwords         # Words to erase.
       (?! $wordchar )   # At end of word.
       $whitespace?
    / /xg;  # Avoid joining two numbers.
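To see the difference between the two substitutions, here is a small self-contained sketch. The bad-word list (r, rd, red) is borrowed from the Regexp::List example and only stands in for the elided pattern; the sub names strip_with_b and strip_with_lookarounds are made up for this illustration.

```perl
use strict;
use warnings;

my $whitespace = qr/[\s()]+/;
my $badwords   = qr/(?:red|rd|r)/i;   # Assumed list, for illustration only.
my $wordchar   = qr/[a-zA-Z]/;

# Version using \b: digits are word characters, so there is
# no boundary between "5" and "R" or between "d" and "6".
sub strip_with_b {
    my ($s) = @_;
    $s =~ s/ $whitespace? \b $badwords \b $whitespace? / /xg;
    return $s;
}

# Version using lookarounds: only letters count as word characters.
sub strip_with_lookarounds {
    my ($s) = @_;
    $s =~ s/
        $whitespace?
        (?<! $wordchar )  # No letter just before the bad word.
        $badwords
        (?! $wordchar )   # No letter just after the bad word.
        $whitespace?
    / /xg;
    return $s;
}

print strip_with_b('12345Red6789'), "\n";            # 12345Red6789
print strip_with_lookarounds('12345Red6789'), "\n";  # 12345 6789
print strip_with_lookarounds('Redwood'), "\n";       # Redwood
```

The single space in the replacement is what keeps "12345" and "6789" from being joined into one number.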
By the way, Regexp::List can build an efficient $badwords.
    use Regexp::List qw( );

    my @badwords = qw( r rd red );
    my $badwords = Regexp::List->new(modifiers=>'i')->list2re(@badwords);
    # qr/r(?:e?d)?/i
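If Regexp::List isn't installed, a plain alternation is a workable stand-in. This is a different and simpler technique, not what the module does: it just escapes each word and joins them with "|", longest first, without factoring out common prefixes. The sub name build_badwords_re is hypothetical.

```perl
use strict;
use warnings;

# Build a case-insensitive alternation from a list of literal words.
# Longest words go first so that "red" is tried before "r".
sub build_badwords_re {
    my @words = @_;
    my $alt = join '|',
              map  { quotemeta }
              sort { length($b) <=> length($a) } @words;
    return qr/(?:$alt)/i;
}

my $badwords = build_badwords_re(qw( r rd red ));

for my $candidate (qw( Red rd R blue )) {
    my $verdict = $candidate =~ /^$badwords\z/ ? 'bad word' : 'ok';
    print "$candidate: $verdict\n";
}
```

The resulting $badwords can be dropped into either substitution above in place of the Regexp::List result.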
Update: Added Regexp::List bit.
Re^2: RegEx to match at least one non-adjacent term
by Cefu (Beadle) on Dec 07, 2007 at 17:00 UTC
by ikegami (Patriarch) on Dec 07, 2007 at 17:09 UTC