in reply to Improving regular expression to remove stopwords
A negative look-ahead would meet that particular requirement. SSCCE:
    use strict;
    use warnings;
    use Test::More;

    my @stopwords = qw/foo sous bar/;
    my ($rx) = map qr/(?:$_)/, join "|", map qr/\b\Q$_\E\b(?!-)/, @stopwords;

    my @stop = (
        'foo is good',
        'so is sous'
    );
    my @go = (
        'sous-alimentation',
    );

    plan tests => @stop + @go;

    for my $str (@stop) {
        like ($str, $rx, "$str matched");
    }
    for my $str (@go) {
        unlike ($str, $rx, "$str not matched");
    }
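Since the thread is about removing stopwords rather than only detecting them, here is a minimal sketch of the same look-ahead used in a substitution. The helper name `remove_stopwords` and the whitespace clean-up are my own assumptions, not something from the original question. The pattern is also built slightly differently here (one shared `\b...\b(?!-)` around the alternation), which should match the same strings as the per-word version in the SSCCE.

```perl
use strict;
use warnings;

# Same stopword list as in the SSCCE.
my @stopwords = qw/foo sous bar/;

# Escape each word, join into one alternation, and wrap it once with
# word boundaries plus the negative look-ahead for a trailing hyphen.
my $alternation = join '|', map { quotemeta($_) } @stopwords;
my $rx = qr/\b(?:$alternation)\b(?!-)/;

# remove_stopwords is a hypothetical helper for illustration only.
sub remove_stopwords {
    my ($text) = @_;
    $text =~ s/$rx//g;          # drop stopwords not followed by a hyphen
    $text =~ s/\s+/ /g;         # collapse runs of whitespace left behind
    $text =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    return $text;
}

print remove_stopwords('foo is good'), "\n";        # is good
print remove_stopwords('so is sous'), "\n";         # so is
print remove_stopwords('sous-alimentation'), "\n";  # sous-alimentation
```

Note that this only guards against a hyphen after the stopword. If a stopword can also appear after a hyphen, a negative look-behind such as `(?<!-)` in front of the alternation would probably be needed as well, but that goes beyond the stated requirement.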
Re^2: Improving regular expression to remove stopwords
by IB2017 (Pilgrim) on Jan 09, 2019 at 13:18 UTC