
Assuming that the regex engine accepts Perl5 regex extensions, this may do what you need it to.

/^(?!7000$|7777$|7778$|3886200$|2200$|8488$|3406$|9100$|29389988$|7688$|5000$|20$|3408$|3404$|7648$)\d+$/
It seems to work for all the testcases in my post above plus several more I threw in.
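
A quick way to check a pattern like this is to run it against a handful of inputs. The sketch below is only illustrative: the helper name `allowed` and the sample inputs are assumptions chosen for the example (70000, 77000 and 207000 are the values the follow-up reply reports as previously failing), not the original test script.

```perl
use strict;
use warnings;

# The pattern from this post: each blocked number is anchored to the end
# of the string, so only an exact, whole-string hit trips the lookahead.
my $re = qr/^(?!7000$|7777$|7778$|3886200$|2200$|8488$|3406$|9100$|29389988$|7688$|5000$|20$|3408$|3404$|7648$)\d+$/;

# Hypothetical helper: true if the string is all digits and not a blocked number.
sub allowed { return $_[0] =~ $re ? 1 : 0 }

# Sample inputs (assumed for illustration): three blocked numbers, the three
# values that used to fail, an ordinary number, and a non-numeric string.
for my $n (qw(7000 20 7648 70000 77000 207000 12345 abc)) {
    printf "%-8s %s\n", $n, allowed($n) ? 'allowed' : 'rejected';
}
```

The blocked numbers and the non-numeric string should come out as rejected; everything else should be allowed.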

Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
Pick up your cloud down the end and "Yes" if you get allocated a grey one they are a bit damp under foot, but someone has to get them.
Get used to the wings fast cos its an 8 hour day...unless the Govenor calls for a cyclone or hurricane, in which case 16 hour shifts are mandatory.
Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

Re: Re: Re: regex that simulates boolean logic
by Anonymous Monk on Nov 28, 2002 at 03:01 UTC

    Thanks for the reply, BrowserUk. Yes, it fixes the cases that didn't work before (it was failing on 70000, 77000 and 207000; those now pass). I'll collect as many more cases as I can from the logfile and throw them at my test script during the course of the day.

    in the interests of learning a bit more about how this works, could you please check if my decoding of the regex is correct.. ?

    What you're doing is anchoring to the start of a line (^), then you got the negation (?!), then you're doing OR based matches, with one crucial difference from what I was doing, you're anchoring EACH of the matches to the end of line char ($).. and then finally 1 or more digit chars are a positive match (\d+$)..

    Compared to my effort,

    /^(?!.*7000|7777|7778|3886200|2200|8488|3406|9100|29389988|7688|5000|20|3408|3404|7648).*[0-9]+/)
    I didn't need the [0-9]+, I could have simply used \d+ instead, and I didn't need a .* before matching the trailing digits.. but err, one question: how does this work without a \d+ or a .* match in front ?

    Oh, and one more question, please.. where do I learn more about all this ? I have a browser window opened to the perlre manpage, and another to the CD bookshelf; are there any other references on this sort of thing ? (I have a copy of Jeffrey Friedl's regex book, but it sorta whizzed over my head) :(

      This may clarify things a little (or not:).

      my $fail = qr/^         # Start looking at the beginning of the string
          (?!                 # and immediately fail if you find
              7000$           # "7000" then the end of the string
                              # (ie. the whole string = "7000")
            | 7777$           # or "7777"
            | 7778$           # etc.
            | 3886200$
            | 2200$
            | 8488$
            | 3406$
            | 9100$
            | 29389988$
            | 7688$
            | 5000$
            | 20$
            | 3408$
            | 3404$
            | 7648$
          )                   # If you got this far without failing (ie. without
                              # matching anything enclosed above),
          \d+$                # then match if the whole string consists of digits.
      /x;

      You could also avoid the need to repeat the end-of-string token ($) over and over by using a second set of non-capturing brackets like this:

      my $fail = qr/^(?!(?:7000|7777|7778|3886200|2200|8488|3406|9100|29389988|7688|5000|20|3408|3404|7648)$)\d+$/;

      This just says, starting at the beginning of the string, fail if everything between there and the end of the string matches anything in the inner set of brackets. If you don't fail, then match if the whole string consists of 1 or more digits.
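
      If the list of blocked numbers lives in an array (or comes from a file), the same pattern can be assembled rather than typed out. This is only a minimal sketch: `@blocked`, `$alt`, `$ok` and `is_ok` are hypothetical names picked for the example. `quotemeta` isn't strictly needed for pure digits, but it guards against regex metacharacters sneaking into the list.

      ```perl
      use strict;
      use warnings;

      # Assumed for illustration: the blocked numbers held in an array.
      my @blocked = qw(7000 7777 7778 3886200 2200 8488 3406 9100 29389988
                       7688 5000 20 3408 3404 7648);

      # Join the escaped numbers into one alternation; the single $ after the
      # non-capturing group anchors every alternative to the end of the string.
      my $alt = join '|', map { quotemeta($_) } @blocked;
      my $ok  = qr/^(?!(?:$alt)$)\d+$/;

      # Hypothetical helper: true if the string is all digits and not blocked.
      sub is_ok { return $_[0] =~ $ok ? 1 : 0 }

      print "$_: ", (is_ok($_) ? "match" : "no match"), "\n"
          for qw(7000 70000 2 20 200);
      ```

      Only the exact strings 7000 and 20 should report "no match" here; 70000, 2 and 200 merely contain or extend a blocked number, so they still match.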

      how does this work without a \d+ or a .* match in front ?

      (?!...) is a "zero-width assertion", which means that it matches (or fails in this case) without consuming any chars (ie. doesn't move pos). So, if you get to the other end of it without failing, then you are still attempting to match from the first char of the string, hence the \d+$ at the end takes care of everything you want to match.
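
      To see the zero-width behaviour for yourself, here is a small illustrative sketch (the sample string and the helper name `match_info` are arbitrary choices for the example). When the lookahead is the only thing in the pattern, a successful match is empty and ends at offset 0; once \d+$ is added, it starts consuming from that same first character.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: returns the matched text and the offset at which
      # the match ended, or an empty list if the pattern did not match.
      sub match_info {
          my ($str, $re) = @_;
          return $str =~ $re ? ($&, $+[0]) : ();
      }

      # Lookahead only: it succeeds on "12345" but consumes nothing.
      my ($text, $end) = match_info('12345', qr/^(?!7000$)/);
      printf "lookahead alone:      matched '%s', ended at offset %d\n", $text, $end;

      # Same lookahead followed by digits: matching resumes at the first char.
      ($text, $end) = match_info('12345', qr/^(?!7000$)\d+$/);
      printf "lookahead plus digits: matched '%s', ended at offset %d\n", $text, $end;
      ```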

      where do I learn more about all this ?

      Unfortunately, I'm unaware of any better resources than those you are already looking at. All the information is there, it just needs time and practice to wrap your brain around it. The way I learned was to spend literally days attempting to come up with regexes to match any given query that has come up here over the past 3-4 months. Slowly, you begin to recognise patterns in the way you do, or other people do things and it gets easier, though I'm still far from an expert. I arrive at most of mine through trial and error.

      It's also unfortunate that complex regexes are frowned on around here for the most part; otherwise I think there would be more discussion and tutorials available on the ways this amazingly powerful language within a language can be used.

      It amazes me that the same people who advocate using emacs, where you need to remember something like (and I'm possibly exaggerating slightly here:) Cntl-X!!cntl-J##-$$ in order to cut and paste a bit of text, find the terse notation of regexes so uncomfortable.

