in reply to Using a variable as pattern in substitution

You could do it with the loop, as you've hinted:

foreach my $stopword (@stop_words) {
    $data =~ s/\b$stopword\b//g;
}
Another possibility is to combine the words into a single regex, and apply it once (note, this is untested):
my $stop_pat = '\b(' . join('|', @stop_words) . ')\b';
$data =~ s/$stop_pat//g;
Whether that will be faster depends a lot on the number of words, their lengths, and the length of the input string. Give each a try. You may also want to see if using the study function improves things any.
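
If you want numbers rather than guesses, the core Benchmark module can time both approaches side by side. This is only a rough sketch: the stop words, the sample text, and the sub names strip_loop and strip_combined are made up for illustration, and I have added quotemeta (or \Q...\E) and a non-capturing group, which the snippets above do not use.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data; substitute your own stop words and input.
my @stop_words = qw(the a an of and);
my $text       = join ' ', ('the cat and a dog sat on an edge of the mat') x 100;

# Approach 1: one substitution per stop word.
sub strip_loop {
    my $data = shift;
    foreach my $stopword (@stop_words) {
        $data =~ s/\b\Q$stopword\E\b//g;
    }
    return $data;
}

# Approach 2: a single precompiled alternation.
my $alternation = join '|', map { quotemeta($_) } @stop_words;
my $stop_re     = qr/\b(?:$alternation)\b/;

sub strip_combined {
    my $data = shift;
    $data =~ s/$stop_re//g;
    return $data;
}

# Sanity check before timing: both approaches should agree.
die "results differ\n" unless strip_loop($text) eq strip_combined($text);

# Run each for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    loop     => sub { strip_loop($text) },
    combined => sub { strip_combined($text) },
});
```

The results will vary with your real word list and input, so treat whatever table this prints as a starting point only.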

HTH

Re: Re: Using a variable as pattern in substitution
by Popcorn Dave (Abbot) on Jun 05, 2002 at 18:13 UTC
    Wouldn't it be easier or quicker to do something along the lines of:

    @array2 = grep {$_ =~ s/\b$stopword\b//g } @array1;

    or does grep compare favorably to a foreach as far as efficiency?

    Some people fall from grace. I prefer a running start...

    UPDATE:

    After thinking on this a bit, I think this solution would work for the problem. Fellow monks, please point out my folly if I'm wrong with this.

    #!/usr/bin/perl
    use strict;

    my @list = qw( red blue orange yellow black brown green );
    print join( ' ', @list );
    print "\n";
    @list = grep { m/e/io } @list;
    print join( ' ', @list );
    print "\n";

    I realize my regex isn't what the original poster was trying to do, but the idea is the same I believe.

    This is what I was trying to say in response earlier but I seem to have had a brain fart this morning.

      Wouldn't it be easier or quicker to ...
      Maybe or maybe not, but it would also not do what the original is asking for. This is looping over the input data, not over the patterns. It's also generating a second array instead of doing it in-place. Regardless of speed, that's guaranteed to need more memory.
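
      To make that concrete, here is a small sketch of what grep with s/// actually does. The array contents and the stop word are invented for the example. Since s///g returns the number of substitutions, grep keeps only the elements where the word was found, and because $_ aliases each element, the original array is changed too.

      ```perl
      use strict;
      use warnings;

      # Hypothetical data, only for illustration.
      my $stopword = 'the';
      my @array1   = ('the cat', 'a dog', 'over the hill');

      # s///g returns the number of substitutions made (false if none),
      # so grep keeps only elements in which the stop word was found.
      my @array2 = grep { s/\b$stopword\b//g } @array1;

      # $_ is an alias to each element, so @array1 was modified as well.
      print "array1: ", join('|', @array1), "\n";
      print "array2: ", join('|', @array2), "\n";
      ```

      So 'a dog' survives untouched in @array1 but is missing from @array2, which is probably not what anyone wants from a stop-word filter.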

Re: Re: Using a variable as pattern in substitution
by stew (Scribe) on Jun 05, 2002 at 15:46 UTC
    Thanks for that. One thing: when I first tried it, it didn't work. My stop words were in a text file, one per line, so I had to

     chop $stopwords

    to get it to work.

    One more thing: what does the \b mean?

    Stew

      You can chomp every element in the array when you slurp the file:

      chomp(@stopwords = <FILE>);

      (As a side note, in general, chomp is preferable to chop.)

      The \b says to look for a word boundary. In other words, if your target string is part of another word, the \b will keep it from matching.

      my $target = "another";
      $target =~ /other/;     # this matches
      $target =~ /\bother/;   # this doesn't
      You can read more about it in the perlre document.

      Avoid chop unless you want to remove the last character from every line of your file. Are you sure the last one ends with a newline? If not, oops.. there goes a character of data. chomp is not so indiscriminate.
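
      A quick sketch of the difference, assuming a stop word file whose last line has no trailing newline (the list below stands in for such a file and is made up):

      ```perl
      use strict;
      use warnings;

      # Hypothetical file contents: the last line has no trailing newline.
      my @lines = ("the\n", "and\n", "of");

      my @chopped = @lines;
      chop @chopped;     # removes the last character of every element

      my @chomped = @lines;
      chomp @chomped;    # removes only a trailing newline ($/), if present

      print join(',', @chopped), "\n";   # the,and,o
      print join(',', @chomped), "\n";   # the,and,of
      ```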

      Makeshifts last the longest.