in reply to Re: A (non) reg-ex question
in thread A (non) reg-ex question

A supplementary reply, this time in the form of a question.

I am trying to understand your third method that uses the recursive regular expression. What I think it is doing is finding balanced "01" pairs in the same way as the Camel book example finds balanced "()" pairs. What I want to know is, how does the regular expression know when to stop recursing? Is the "?" quantifier something to do with it? It seems to me that the regular expression is consuming characters from both ends of the string at the same time. Am I missing something obvious?

Cheers,

JohnGG

Replies are listed 'Best First'.
Re^3: A (non) reg-ex question
by hv (Prior) on Mar 20, 2006 at 23:51 UTC

    I think my edition of the Camel predates qr{} so I doubt I have that example to hand, but I'd expect it to be much the same except that balanced parens can usually look like "()(())(()())" and the like.

    Except for certain special optimisations, a regular expression is always traversed from left to right. The critical thing to stop infinite recursion is to ensure that at least one character must be matched each time before recursing, ie that there be something to the left of the nested $re.

    In this case, ($re = qr{(?:0(??{$re})1)?}), we require a '0' to be matched before recursing. So "0011" =~ /^$re\z/ will be executed like:

    anchor to start try $re match '0' try $re match '0' try $re fail to find '0' group is optional, matched zero times $re matched '' match '1' group is optional, matched one time $re matched '01' match '1' group is optional, matched one time $re matched '0011' anchor to end match complete: matched '0011' from start

    Similarly, you could (foolishly) replace /0+/ with a recursive regexp like:

    $re = qr{0(??{$re})?};
    which would work, but the following would fall into infinite recursion:
    $re = qr{(??{$re})?0};

    Hope this helps,

    Hugo

      Thank you. That explains it very clearly. What a powerful feature. I think I'll go away and practice before I try to use this in anger.

      I have the Third Edition of the Camel book in which qr{} and (??{$re}) appear. I have to say that your explanation is clearer than the book.

      Cheers,

      JohnGG