in reply to broken regex

(Please read Writeup Formatting Tips and format your posts so they are easier to read)

It's hard (maybe impossible) to know what the correct regex is without knowing the format of the data that you are dealing with.

--
<http://dave.org.uk>

"The first rule of Perl club is you do not talk about Perl club."
-- Chip Salzenberg

Replies are listed 'Best First'.
Re^2: broken regex
by elwood (Initiate) on Sep 15, 2006 at 12:48 UTC
    The data in $line is a simple sentence.
    The @words array is a list of individual words to remove, one per entry. For example, $line = "the cat sat on the mat"
    and @words might contain
    "on"
    "the"

      Ok. Now we're getting somewhere. In <code> tags, your code looks like this:

      foreach (@words) { $line =~ s/([%\d])$_([%\d])//g; print "$line\n"; }

      So the regex is saying this:

      Look for either a percent sign or a digit (captured in $1), followed by the string in $_, followed by another percent sign or a digit (captured in $2).

      And if all that is found then it's replaced by an empty string.
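      To see that behaviour concretely, here is a minimal sketch. It uses the same pattern with a named loop variable in place of $_. The $tagged string is an invented input, included only to show what the pattern does match; it does not come from your data.

```perl
use strict;
use warnings;

# $plain is the sentence from the thread; $tagged is a hypothetical input
# in which the words are wrapped in percent signs or digits.
my $plain  = "the cat sat on the mat";
my $tagged = "a %on% b 1the2 c";

my @words = ("on", "the");

foreach my $word (@words) {
    # Same pattern as the original code, with $word in place of $_.
    $plain  =~ s/([%\d])$word([%\d])//g;
    $tagged =~ s/([%\d])$word([%\d])//g;
}

print "$plain\n";   # unchanged: no word has a digit or % on both sides
print "$tagged\n";  # each delimited word is removed along with its delimiters
```

      The captures in $1 and $2 are never used in the replacement, so whatever they matched is deleted along with the word.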

      As your data doesn't appear to contain any digits or percent signs, none of that is ever going to match. I think that what you actually wanted was far simpler:

      foreach (@words) { $line =~ s/\b$_\b//g; print "$line\n"; }

      And I think you may want to move the print statement out of the loop - tho' it might be there for debugging purposes.
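      Putting those two suggestions together, a sketch might look like the following. The \Q...\E quoting and the final whitespace tidy-up are my own additions rather than anything you asked for: the first guards against words that contain regex metacharacters, and the second removes the gaps left behind where words were deleted.

```perl
use strict;
use warnings;

my $line  = "the cat sat on the mat";
my @words = ("on", "the");

foreach my $word (@words) {
    # \b anchors the match at word boundaries, so "the" will not match
    # inside a longer word. \Q...\E escapes any regex metacharacters
    # in $word (an added precaution).
    $line =~ s/\b\Q$word\E\b//g;
}

# Optional tidy-up (an assumption about the desired output):
# collapse runs of whitespace, then trim both ends.
$line =~ s/\s+/ /g;
$line =~ s/^\s+|\s+$//g;

# Print once, after all the words have been removed.
print "$line\n";
```

      With the sample data this prints "cat sat mat". Without the tidy-up step the words are still removed, but the spaces that surrounded them stay in the string.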

      See perlretut for a good introduction to regexes.

      --
      <http://dave.org.uk>

      Then you probably want your regex to be something more like s/\b$_\b//g which will match word boundaries. Just to make sure that there's nothing more that you need to do than is coming across in your message, what were you aiming to do with the %\d?

      Hays