in reply to Recursive Regex: Response
in thread Parsing using Regex and Lookahead

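Okay, so after looking at it: ([^\]]+) basically says "match and capture one or more characters that are not a close-bracket." I would use * instead of + because of the empty div discussed above: + requires at least one character, so it cannot match when there is nothing before the close-bracket.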
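
A minimal sketch of the + versus * difference. The literal brackets around the class and the sample inputs '[div]' and '[]' are assumptions made up for illustration, not the actual markup from this thread:

```perl
use strict;
use warnings;

# Return the captured text, or undef if the pattern does not match.
sub capture_with {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

my $plus = qr/\[([^\]]+)\]/;   # one or more non-']' characters
my $star = qr/\[([^\]]*)\]/;   # zero or more non-']' characters

# '[div]' and '[]' are hypothetical inputs.
for my $text ('[div]', '[]') {
    for my $pair (['+', $plus], ['*', $star]) {
        my ($name, $re) = @$pair;
        my $got = capture_with($re, $text);
        printf "%-5s with %s: %s\n", $text, $name,
            defined $got ? "captured '" . $got . "'" : 'no match';
    }
}
```

With *, the empty case still matches and captures an empty string; with +, it does not match at all.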

I wonder whether (?>...) might be of use here.


Additionally, I don't know how I would stream the text, since it arrives as a parameter passed from a web form.

Re^2: Recursive Regex: Response
by furry_marmot (Pilgrim) on Mar 19, 2009 at 20:26 UTC
    Sorry not to reply sooner, but I've been busy.

    Backtracking is simply what the regex engine does when an attempt fails but it still has other alternatives to try. A simple example: to match /(this|that).*(these|those)/, the engine first looks for a 't', then an 'h', and so on. If it finds an 'n' after 'thi', it backtracks to see whether it can match 'that' instead. In this case, though it might not be nice to look at, breaking it out into four regexes (/this.*these/, /this.*those/, etc.) can turn out to be more efficient than the alternation version, because if one of them fails to find 'this', for example, it simply fails without trying additional alternatives.
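
    To make the two forms concrete, here is a rough sketch that checks they agree on a few strings. The sub names and sample strings are invented for illustration, and the code says nothing about speed by itself:

```perl
use strict;
use warnings;

# One regex with two alternations.
sub match_alternation {
    my ($text) = @_;
    return $text =~ /(this|that).*(these|those)/ ? 1 : 0;
}

# The same test broken out into four alternation-free regexes.
sub match_split {
    my ($text) = @_;
    return 1 if $text =~ /this.*these/;
    return 1 if $text =~ /this.*those/;
    return 1 if $text =~ /that.*these/;
    return 1 if $text =~ /that.*those/;
    return 0;
}

# Hypothetical sample strings; 'thin thaw' contains neither 'this' nor 'that'.
for my $text ('take this, not those', 'thin thaw', 'that and these') {
    printf "%-22s alternation=%d split=%d\n",
        $text, match_alternation($text), match_split($text);
}
```

    Which form is actually faster depends on the perl version and the data, so it is worth timing both on real input (the core Benchmark module is one way) before committing to the uglier version.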

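    Anyway, (?>...) is a way to cut off backtracking for hairy regexes. It makes the enclosed subpattern atomic: once that subpattern has matched, the engine will not go back into it to try a shorter or different match. That can make parsing a lot faster, especially when a match is going to fail anyway.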
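
    A small sketch of what that means in practice. The string 'aaab' and the bracket pattern are made-up examples, not the regex from the original question:

```perl
use strict;
use warnings;

my $text = 'aaab';

# Ordinary group: a+ grabs 'aaa', then gives one 'a' back so 'ab' can match.
my $plain = $text =~ /(a+)ab/ ? 1 : 0;

# Atomic group: a+ grabs every 'a' it can and never gives one back,
# so the literal 'ab' that follows cannot match at any start position.
my $atomic = $text =~ /(?>a+)ab/ ? 1 : 0;

# Hypothetical bracket pattern: the class can never match ']', so giving
# characters back could not help, and making it atomic loses nothing.
my $tag = '[div]' =~ /\[(?>([^\]]*))\]/ ? $1 : undef;

print "plain=$plain atomic=$atomic tag=", (defined $tag ? $tag : 'undef'), "\n";
```

    The first pair shows that an atomic group can change what matches, so it is only safe where backtracking into the group could never have helped, as in the bracket case.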