in reply to Re: A (non) reg-ex question
in thread A (non) reg-ex question

For amusement, here is the recursive regexp modified to match the case where there are as many 1's as 0's (in any order):
my $re; $re = qr{ (?: 0 # 0 (??{$re}) # Balanced 1's and 0's 1 # 1 | 1 # 1 (??{$re}) # Balanced 1's and 0's 0 # 0 )* }x; /^$re\z/;
A random note. For amusement I tried to optimize this, but that failed. I believe the following should be fast, but it isn't, and that's probably a bug (in my understanding, or in the RE engine):
my $re; $re = qr{ (?: 0 # 0 (?> # If I get here, do not backtrack! (??{$re}) # Balanced 1's and 0's ) 1 # 1 | 1 # 1 (?> # If I get here, do not backtrack! (??{$re}) # Balanced 1's and 0's ) 0 # 0 )* }x; for (qw(1001 11100 000111 00111), "101"x50) { if (/^$re\z/) { print "$_ matched\n"; } else { print "$_ did not match\n"; } }
Given the performance problems, I would not recommend using this regular expression in real life.

UPDATE: It was a bug in my understanding. The following works and is fast:

my $re; $re = qr{ (?: 0 # 0 (??{$re}) # Balanced 1's and 0's 1 # 1 | 1 # 1 (?> # If I get here, do not backtrack! (??{$re}) # Balanced 1's and 0's ) 0 # 0 )*? }x; for (qw(1001 11100 000111 00111), "101"x50) { if (/^$re\z/) { print "$_ matched\n"; } else { print "$_ did not match\n"; } }
And so is this:
my $re; $re = qr{ (?> # Avoid backtracking 0 # 0 (??{$re}) # Balanced 1's and 0's 1 # 1 | 1 # 1 (??{$re}) # Balanced 1's and 0's 0 # 0 )*? }x; for (qw(1001 11100 000111 00111), "101"x50) { if (/^$re\z/) { print "$_ matched\n"; } else { print "$_ did not match\n"; } }
The key is that you need to avoid having to backtrack on success.