in reply to Improving regular expression to remove stopwords

Hi IB2017

You may want to look into so-called 'lookahead' and 'lookbehind' assertions. The following example uses a negative lookahead, (?!\-), so that a stopword is not matched when it is immediately followed by a hyphen.

use strict ;
use warnings ;

my @stopwords = qw{ sous } ;
my $string = "sous sous-alimentation" ;

# One sub-pattern per stopword: whole word, not followed by a hyphen
my ($rx) = map qr/(?:$_)/, join "|", map qr/\b\Q$_\E\b(?!\-)/, @stopwords ;

$string =~ s/$rx//g ;
print $string ;

__END__
sous-alimentation
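The lookahead above only protects compounds where the stopword comes first. If a stopword can also end a hyphenated compound (for example 'vous' in 'rendez-vous'), a negative lookbehind, (?<!-), can be added in the same way. The following is only a minimal sketch under that assumption: the helper name strip_stopwords and the sample string are made up for illustration and are not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original thread): removes each stopword
# unless it is attached to a hyphen on either side.
sub strip_stopwords {
    my ($text, @stopwords) = @_;

    # Escape each stopword and join them into a single alternation
    my $alternation = join '|', map { quotemeta } @stopwords;

    # (?<!-) : not preceded by a hyphen
    # (?!-)  : not followed by a hyphen
    my $rx = qr/(?<!-)\b(?:$alternation)\b(?!-)/;

    $text =~ s/$rx//g;
    return $text;
}

# Brackets make the leftover whitespace visible in the output
print '[', strip_stopwords('sous sous-alimentation rendez-vous vous', qw{ sous vous }), "]\n";
```

Note that the surrounding spaces are left in place, so you may still want to collapse or trim whitespace afterwards.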

Veltro

Edit: added a link to the "Extended Patterns" section of perlre.