in reply to Re: Regex refresher
in thread Regex refresher

Unless I'm hopelessly wrong, for starters I think you have step one wrong.

I think step one should be something like: match either 0 or 1(01*0)*1. The final asterisk then repeats that whole alternation zero or more times, and each repetition is free to pick either branch, so that's how it matches 00 (the first branch, twice).
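If you want to check this reading rather than trust my head, here is a minimal sketch. It is Python rather than Perl (the regex syntax is the same for this pattern), and it assumes the full pattern is (0|1(01*0)*1)* matched against the whole string; the anchoring is my assumption, not something stated in the thread.

```python
import re

# Assumed pattern: the two branches from step one, grouped, with the final asterisk.
PATTERN = re.compile(r"(0|1(01*0)*1)*")

def matches(s):
    """True if the whole string s is matched by the assumed pattern."""
    return PATTERN.fullmatch(s) is not None

# A few strings discussed in the thread.
for s in ["", "0", "00", "11", "1001", "101"]:
    print(repr(s), matches(s))
```

Under that assumption, every string in the list except 101 should come back True.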

If this was your stumbling block, hopefully I've helped!

Re: Re: Re: Regex refresher (help?)
by charnos (Friar) on Sep 12, 2002 at 19:43 UTC
    I think I'm missing something as well. My logic follows yours, and I got *almost* the same answers (in my head, anyway) as dws'. Ours differ firstly because he repeated 1111 ;), but more importantly on the number 101. I can't see that number matching: as far as I can tell, 1(01*0)*1 would match 1, and then, if it continues into the middle group, it would have to match at least 00, no? The only quantifier inside the group is on the 1, so as I see it both 0s need to be matched, then the final 1, so that choosing to include the middle group would yield at minimum 1001. Thusly, I came up with:
    "" 0 00 11 000 0000 1001 1111 00000 10101
    Yet, I know that's not right (in hindsight). After looking over some other answers, I tried out 0011, 1100 and 0110, which all work out in practice, yet I can't figure out why. Is there some precedence issue that I'm missing?
    Eh..I've got some time to figure it out though, I'm only a 3rd year undergrad ;)
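One way to compare a hand-built list against the pattern is to enumerate every short binary string and keep the ones that match. This is only a sketch in Python, assuming the pattern under discussion is (0|1(01*0)*1)* applied to the whole string; `matching_strings` is a made-up helper name.

```python
import re
from itertools import product

# Assumed pattern, matched against the entire string.
PATTERN = re.compile(r"(0|1(01*0)*1)*")

def matching_strings(max_len):
    """All strings over 0/1 with length <= max_len that the pattern fully matches."""
    found = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if PATTERN.fullmatch(s):
                found.append(s)
    return found

print(matching_strings(4))
# ['', '0', '00', '11', '000', '011', '110',
#  '0000', '0011', '0110', '1001', '1100', '1111']
```

With that assumption, 0011, 0110 and 1100 all show up at length 4, and 101 does not appear at length 3.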

      See FoxtrotUniform's answer above. 101 is not matched.

      As for your other problems, here's a solution to one of them which should help with the others:

      For reference, the original regex was:


      Let's take your first example: 0011. The first 0 matches the first branch, and the final asterisk right at the end lets the whole group repeat, so the first branch matches the second 0 as well. On the next repetition, 1 starts the second branch; the (01*0) matches nothing, which is OK because of the asterisk right after it. This effectively reduces the second branch to 11, which matches the remaining 11.

      The others (1100, 0110) are variations on this theme.
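To see those variations spelled out, the sketch below splits a string into the pieces that each pass through the outer group would consume. It is Python, not Perl, and it assumes the two branches are 0 and 1(01*0)*1. `split_into_pieces` is a hypothetical helper, and the left-to-right split is an approximation of what the regex engine does that happens to work for these examples.

```python
import re

# One pass through the outer group: either a single 0, or 1(01*0)*1.
PIECE = re.compile(r"0|1(?:01*0)*1")

def split_into_pieces(s):
    """Return the pieces that consume s left to right, or None if s is not fully consumed."""
    pieces = [m.group(0) for m in PIECE.finditer(s)]
    # finditer skips characters it cannot match, so check that nothing was skipped.
    return pieces if "".join(pieces) == s else None

for s in ["0011", "1100", "0110", "1001", "101"]:
    print(s, split_into_pieces(s))
# 0011 ['0', '0', '11']
# 1100 ['11', '0', '0']
# 0110 ['0', '11', '0']
# 1001 ['1001']
# 101 None
```

Each 0 is one trip through the first branch, each 11 is the second branch with the middle group matching zero times, and 1001 is the second branch with the middle group matching 00 once.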

        g'ah! I know it's no excuse, but I was on my way out from work when I posted, and for some reason got it stuck in my head that the outer ()* would only match repetition, i.e. 0 could be matched, but if you matched the leading 0, then the next match would have to be 0. Likewise if the pattern matched 11, it would have to match from the right side of the | again. *sigh* ++zakb, I needed that. :P