Melly has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monkees

I ended up writing two regexes when I'm sure one would do - can anyone compress the following to a single regex?

($foo =~ /$users_regex/ and $foo !~ /,v$/)

$users_regex can be any valid regex. We can ignore the edge-case of the user's regex wanting to match .*,v$

Tom Melly, tom@tomandlu.co.uk

Re: RegEx - match pattern not followed by literal
by Fletch (Bishop) on Oct 12, 2006 at 15:27 UTC
    Do you hear that, Mr. Anderson? That's the sound of premature optimization . . .

    The overhead in it being two regexen is so likely trivial as to not be worth the effort. Not to mention you could probably get more of a boost by rearranging the cases so the trivial presence of /,v$/ is checked first before bothering to go on to the user's regex.
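The reordering suggested above can be sketched as follows. This is only an illustration: the helper name `wanted` and the sample pattern and file names are made up for the example and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $foo matches the user's pattern
# and does not end in the literal ",v".
sub wanted {
    my ($foo, $users_regex) = @_;
    return 0 if $foo =~ /,v$/;      # cheap literal test first
    my $ok = $foo =~ $users_regex;  # user's regex only when still needed
    return $ok ? 1 : 0;
}

my $users_regex = qr/^abc/;   # stand-in for whatever the user supplied
for my $foo ('abc.txt', 'abc.txt,v', 'xyz.txt') {
    printf "%-10s %s\n", $foo, wanted($foo, $users_regex) ? 'keep' : 'skip';
}
```

The early return keeps the two tests separate and readable while skipping the potentially expensive user pattern for anything ending in ,v.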

      Well, it wasn't really a question about optimisation (in terms of speed) - more about 'neatness'.

      That said, good point about testing for ,v prior to the user's regex.

      Hey! Did you down-vote me? Wah! This means war...

      Tom Melly, tom@tomandlu.co.uk

        I downvote everyone. Even myself.

        Well, it wasn't really a question about optimisation (in terms of speed) - more about 'neatness'.

        Neatness is in the eye of the beholder. As a general rule, if something fits naturally into two separate matches, chances are it can indeed be done in one, but at the expense of neatness, and vice versa. I say so because one reads all the time about people who "want to do it in just one match" because they think it's neater. Most times it plainly won't be.

Re: RegEx - match pattern not followed by literal
by chargrill (Parson) on Oct 12, 2006 at 15:29 UTC

    Perhaps using a negative lookahead?

    $foo =~ /$users_regex(?!,v$)/

    From perldoc perlre:

    "(?!pattern)"

    A zero-width negative look-ahead assertion. For example "/foo(?!bar)/" matches any occurrence of "foo" that isn't followed by "bar". Note however that look-ahead and look-behind are NOT the same thing. You cannot use this for look-behind.

    If you are looking for a "bar" that isn't preceded by a "foo", "/(?!foo)bar/" will not do what you want. That's because the "(?!foo)" is just saying that the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will match. You would have to do something like "/(?!foo)...bar/" for that. We say "like" because there's the case of your "bar" not having three characters before it. You could cover that this way: "/(?:(?!foo)...|^.{0,2})bar/". Sometimes it's still easier just to say:

    if (/bar/ && $` !~ /foo$/)

    Update: Fixed formatting, and adding a comment that the documentation does suggest, as Fletch suggests, that it's sometimes still easier to do what Melly is originally doing.



    --chargrill
    s**lil*; $*=join'',sort split q**; s;.*;grr; &&s+(.(.)).+$2$1+; $; = qq-$_-;s,.*,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$*,$/)
      No, that doesn't work.
      $users_regex = qr/^./s;   'abc,v' =~ /$users_regex(?!,v$)/;  # Matches, but it shouldn't.
      $users_regex = qr/^.*$/s; 'abc,v' =~ /$users_regex(?!,v$)/;  # Matches, but it shouldn't.
      $users_regex = qr/,/;     'abc,v' =~ /$users_regex(?!,v$)/;  # Matches, but it shouldn't.

      Solution:

      /^(?=.*?$users_regex)(?!.*?,v$)/s
      which simplifies to
      /^(?!.*,v$).*?$users_regex/s

      Of course, this is not nearly as readable as the original.

      Updated to show more examples where the parent doesn't work.
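One way to gain some confidence in the single-pattern form above is to compare it against the original two-match condition on a handful of inputs. The sketch below does that; the sub names `two_step` and `one_step` and the sample strings are assumptions made for illustration, and a small spot check like this is not a proof of equivalence.

```perl
use strict;
use warnings;

# Reference behaviour: the original two-match condition.
sub two_step {
    my ($foo, $re) = @_;
    my $ok = ($foo =~ /$re/ and $foo !~ /,v$/);
    return $ok ? 1 : 0;
}

# Single-pattern form quoted in the post above.
sub one_step {
    my ($foo, $re) = @_;
    my $ok = $foo =~ /^(?!.*,v$).*?$re/s;
    return $ok ? 1 : 0;
}

my @patterns = (qr/^./s, qr/^.*$/s, qr/,/, qr/b/);   # illustrative only
my @strings  = ('abc,v', 'abc', 'a,b', ',v', '');

my $mismatches = 0;
for my $re (@patterns) {
    for my $foo (@strings) {
        my ($two, $one) = (two_step($foo, $re), one_step($foo, $re));
        $mismatches++ if $two != $one;
        printf "%-12s %-8s two=%d one=%d\n", $re, "'$foo'", $two, $one;
    }
}
print "mismatches: $mismatches\n";
```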

        Heh - yeah, looks like the single-regex solution is just too darned ugly (and I'd feel guilty handing that on to be maintained).

        I'll stick with my original two regex solution (but implement that bastard Fletch's suggestion to swap the regexes and check for ,v$ first)

        I guess I was just surprised that the reverse, /$users_regex,v$/, was so easy, but handling NOT ,v$ is so tricky...

        Tom Melly, tom@tomandlu.co.uk
      I think your negative look-ahead needs to be something like (?!.*,v$) because you don't know where the user-supplied regex is going to leave off matching. I agree with Fletch, however, as the optimisation game is probably not worth the candle.

      Cheers,

      JohnGG
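The widened lookahead (?!.*,v$) suggested above can be tried against the counterexample patterns from earlier in the thread. The sketch below is a quick check under that assumption, keeping the lookahead after the user's pattern; the variable name `@wrongly_accepted` is made up for the example.

```perl
use strict;
use warnings;

my $foo = 'abc,v';   # ends in ",v", so every pattern below should be rejected

# Patterns taken from the counterexamples earlier in the thread.
my @patterns = (qr/^./s, qr/^.*$/s, qr/,/);

my @wrongly_accepted;
for my $users_regex (@patterns) {
    # Lookahead widened to (?!.*,v$) but still placed after the user's pattern.
    if ($foo =~ /$users_regex(?!.*,v$)/) {
        push @wrongly_accepted, "$users_regex";
    }
}

print "wrongly accepted: $_\n" for @wrongly_accepted;
```

Two of the three patterns still get through. Once the user's pattern has consumed part or all of the trailing ,v, the text left for the lookahead no longer ends in ,v from its point of view, which is why the anchored form /^(?!.*,v$).../ tests the whole string from the start instead.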

        I don't think a negative look ahead will work - .*(?!,v) will still match foo,v since the .* can just suck up the ,v

        Tom Melly, tom@tomandlu.co.uk
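The point about .* swallowing the ,v can be seen directly by capturing what the greedy part matched. This is a small sketch; the variable names are chosen for the example only.

```perl
use strict;
use warnings;

my $foo = 'foo,v';

# Greedy .* consumes the whole string; the lookahead is then tested at the
# end of the string, where nothing follows, so it trivially succeeds.
my ($greedy) = $foo =~ /(.*)(?!,v)/;
print "greedy .* matched: '$greedy'\n";

# Anchoring the lookahead at the start makes it inspect the whole string.
my $anchored = $foo =~ /^(?!.*,v$)/ ? 'accepted' : 'rejected';
print "anchored lookahead: $anchored\n";
```

The first pattern captures the entire string 'foo,v', while the anchored lookahead rejects it.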