in reply to Re: Word Exclusion Regex (was Re: regex problem)
in thread regex problem

Oops, the original version used join() when creating $first. I don't know why I changed it. As for the other complaint, the regex is designed to ensure the words don't appear at all. If you only wanted a regex that didn't match a string that is a set of words, it would look much simpler: /^(?!(?:cat|dog|pig)$)/. That's not what I was going for.
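To make the difference concrete, here is a minimal sketch that runs both kinds of pattern over a few sample words. The variable names and the sample words are made up for illustration; the first pattern is the simple one quoted above, and the second is the three-word exclusion regex quoted later in this thread.

```perl
use strict;
use warnings;

# Simple form: rejects only a string that is exactly one of the words.
my $not_exactly = qr/^(?!(?:cat|dog|pig)$)/;

# Exclusion form (copied from later in the thread): rejects any string
# in which one of the words appears anywhere.
my $never_contains = qr/^[^pcd]*(?:(?:p(?!ig)|c(?!at)|d(?!og))[^pcd]*)*$/;

for my $word (qw(dog hotdog catalog owl)) {
    printf "%-8s exact-only: %-3s never-contains: %s\n",
        $word,
        ($word =~ $not_exactly    ? 'ok' : 'no'),
        ($word =~ $never_contains ? 'ok' : 'no');
}
```

Here "hotdog" and "catalog" get past the simple pattern, because neither is exactly a forbidden word, but not past the exclusion pattern.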

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Re: Re: Word Exclusion Regex (was Re: regex problem)
by blakem (Monsignor) on Feb 10, 2002 at 17:41 UTC
    whoops... my bad. must have munged the regex myself somehow...

    Pardon my conceit, as I don't mean to contradict this captivating regex, but I purport that it's still not correct... all of which happen to get incorrectly excluded for exclude('dog','cat','pig'): ;-P

    (?-xism:^[^pcd]*(?:(?:p(?!ig)|c(?!at)|d(?!og)))*[^pcd]*$)
    dog =>
    cat =>
    pig =>
    owl => 1
    conceit =>
    contradict =>
    captivating =>
    purport =>
    correct =>

    -Blake
    p.s. List obtained using:

    $ perl -lne 'print if /^[dpc].*[dpc]/ && !/dog|cat|pig/' /usr/dict/words
      Oh, and since you mentioned some intrigue as to the function of the regex, here's what it does:
      1. It matches as many letters as it can that don't start one of the forbidden words.
      2. Then it matches one of those letters, so long as it isn't followed by the rest of the word.
      3. Then it matches as many non-bad letters as it can.
      4. Go to step 2 if you can.
      Friedl would call this "unrolling the loop".
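The steps above can be sketched as a small regex builder. This is not the function from the thread (its source is not shown here); exclude_sketch is a hypothetical name, and grouping the words by first letter is an assumption added so that words sharing an initial letter are handled too.

```perl
use strict;
use warnings;

# Hypothetical helper, not the thread's original function: builds an
# unrolled-loop regex that matches only strings containing none of @words.
sub exclude_sketch {
    my @words = grep { length } @_;
    return qr/^/ unless @words;          # nothing to exclude: match anything

    # Group the tails of the forbidden words by their first letter.
    my %tails;
    for my $word (@words) {
        push @{ $tails{ substr($word, 0, 1) } }, substr($word, 1);
    }

    my @letters = sort keys %tails;
    my $class   = join '', map { quotemeta($_) } @letters;

    # Step 2: a starting letter is fine unless a forbidden tail follows it.
    my $special = join '|', map {
        my $letter = $_;
        my $alts   = join '|', map { quotemeta($_) } @{ $tails{$letter} };
        quotemeta($letter) . "(?!$alts)";
    } @letters;

    # Steps 1, 3 and 4: normal* (special normal*)*
    return qr/^[^$class]*(?:(?:$special)[^$class]*)*$/;
}

my $re = exclude_sketch(qw(dog cat pig));
for my $word (qw(owl conceit purport hotdog pigeon)) {
    print "$word => ", ($word =~ $re ? 1 : ''), "\n";
}
```

The key point is that the "safe letters" class sits inside the repeated group, so the pattern can alternate between a checked starting letter and a run of safe letters as many times as the string needs.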

      _____________________________________________________
      Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
      s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      Um, you're using the wrong regex. My function returns

        ^[^pcd]*(?:(?:p(?!ig)|c(?!at)|d(?!og))[^pcd]*)*$

      whereas you are using

        ^[^pcd]*(?:(?:p(?!ig)|c(?!at)|d(?!og)))*[^pcd]*$

      The [^pcd]* got moved outside the (?:...) somehow. When I use the right regex, I get the right results.
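As a quick check, the two patterns can be run side by side over the word list from the earlier reply. Both regexes are copied from this thread; the variable names $inside and $outside are made up for this sketch.

```perl
use strict;
use warnings;

# [^pcd]* inside the repeated group (the intended pattern).
my $inside  = qr/^[^pcd]*(?:(?:p(?!ig)|c(?!at)|d(?!og))[^pcd]*)*$/;

# [^pcd]* moved outside the repeated group (the munged pattern).
my $outside = qr/^[^pcd]*(?:(?:p(?!ig)|c(?!at)|d(?!og)))*[^pcd]*$/;

my @words = qw(dog cat pig owl conceit contradict captivating purport correct);
for my $word (@words) {
    printf "%-12s inside: %-3s outside: %s\n",
        $word,
        ($word =~ $inside  ? 'ok' : 'no'),
        ($word =~ $outside ? 'ok' : 'no');
}
```

With the class outside the loop, the pattern permits only one contiguous run of p, c, or d letters, so a word such as "conceit" fails as soon as the second c turns up after "on". With the class inside the loop, every such letter gets its own lookahead check and the word matches.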

      _____________________________________________________
      Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
      s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      I don't know where you got that regex from, but it's certainly not from japhy's code. The regex his code produces is: (?-xism:^[^pcd]*(?:(?:p(?!ig)|c(?!at)|d(?!og))[^pcd]*)*$) (That's also the regex you used in your earlier response.) The regex you used in your latest response has a parenthesis in the wrong place and is missing a quantifier; of course it doesn't work!