http://qs1969.pair.com?node_id=197325


in reply to Re: Re: Regex refresher
in thread Regex refresher

I think I'm missing something as well. My logic follows yours, and I got *almost* the same answers (in my head, anyway) as dws. Ours differ firstly because he repeated 1111 ;), but more importantly on the number 101. I can't see that number matching: as far as I can tell, 1(01*0)*1 would match 1, and then, if it continues into the middle group, it would have to match at least 00, no? The only quantifier inside the group is on the 1, so as I see it, 00 needs to be matched, then the final 1, so that including the middle group would yield 1001 at minimum. Thus, I came up with:
"" 0 00 11 000 0000 1001 1111 00000 10101
Yet, I know that's not right (in hindsight). After looking over some other answers, I tried out 0011, 1100 and 0110, which all work out in practice, yet I can't figure out why. Is there some precedence issue that I'm missing?
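A quick check of this kind can be sketched as below. It is only a minimal sketch: the regex is the one quoted in this thread, while the helper name matches and the list of test strings are my own picks for illustration.

```perl
use strict;
use warnings;

# The regex from the thread, anchored so the whole string must match.
my $re = qr/^(0|1(01*0)*1)*$/;

# Hypothetical helper (name is an assumption): returns 1 if the whole
# string matches the regex, 0 otherwise.
sub matches {
    my ($s) = @_;
    return $s =~ $re ? 1 : 0;
}

# Try the empty string plus a handful of the strings discussed here.
for my $s ('', qw(0 11 101 0011 0110 1001 1100 10101)) {
    printf "%-7s %s\n", "'$s'", matches($s) ? 'matches' : 'no match';
}
```

Of the strings listed, only 101 should be reported as "no match".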
Eh..I've got some time to figure it out though, I'm only a 3rd year undergrad ;)

Re: Re: Re: Re: Regex refresher (help?)
by zakb (Pilgrim) on Sep 13, 2002 at 09:14 UTC

    See FoxtrotUniform's answer above. 101 is not matched.

    As for your other problems, here's a solution to one of them which should help with the others:

    For reference, the original regex was:

    /^(0|1(01*0)*1)*$/

    Let's take your first example: 0011. The first 0 matches the first branch of the alternation, and the asterisk right at the end lets the outer group repeat, so the second 0 is matched the same way. The first 1 then starts the second branch; the (01*0) group matches nothing, which is OK because of the asterisk right after it. With that group skipped, the second branch reduces to 11, which matches the remaining 11.

    The others (1100, 0110) are variations on this theme.
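
    To see which piece each pass through the outer group consumes, something like the following may help. It is a rough sketch, not part of the original exercise: the helper name chunks is made up, and it walks the string with \G using one copy of the group body per step, which I believe is equivalent here because each piece can only start and end in one way.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (name is an assumption): split a string into the
    # pieces consumed by each pass through the outer (...)* group.
    # Returns an empty list if the string does not match (or is empty).
    sub chunks {
        my ($s) = @_;
        my @parts;
        pos($s) = 0;
        # \G anchors each attempt where the previous piece ended;
        # /c keeps pos() in place when the next attempt fails.
        while ($s =~ /\G(0|1(?:01*0)*1)/gc) {
            push @parts, $1;
        }
        # Only a full match if the pieces cover the entire string.
        return pos($s) == length($s) ? @parts : ();
    }

    for my $s (qw(0011 1100 0110 10101)) {
        print "$s => ", join(' + ', chunks($s)), "\n";
    }
    # 0011 => 0 + 0 + 11
    # 1100 => 11 + 0 + 0
    # 0110 => 0 + 11 + 0
    # 10101 => 10101
    ```

    Each line shows one branch choice per piece: a lone 0 from the left side of the |, or a 1...1 run from the right side.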

      g'ah! I know it's no excuse, but I was on my way out from work when I posted, and for some reason got it stuck in my head that the outer ()* would only repeat whichever branch matched first, i.e. if you matched a leading 0, then every later match would have to be 0 as well. Likewise, if the pattern matched 11, it would have to match the right side of the | again. *sigh* ++zakb, I needed that. :P