in reply to Re: Regex help
in thread Regex help

The 1 while recursive substitution trick is useful for this sort of problem. See my example above. I prefer a negated character class, i.e. [^\]] in this example, to a non-greedy .+? because it saves backtracking and/or improves accuracy, being slightly more specific. It also matches \n, for example, which . does not by default.
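
As a rough illustration of both points, here is a minimal sketch. It is not the regex from the earlier post: the convert sub, the BBCode-like tags and the HTML-ish output are all assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical converter for a BBCode-like markup; tag names and the
# HTML-ish output are assumptions made for this sketch.
sub convert {
    my ($text) = @_;
    # The body class [^\[\]]* cannot cross a bracket, so each pass rewrites
    # only innermost pairs. "1 while" repeats until s/// finds nothing.
    1 while $text =~ s{\[(\w+)\]([^\[\]]*)\[/\1\]}{<$1>$2</$1>}g;
    return $text;
}

print convert("[b]bold [i]and\nitalic[/i][/b] plain"), "\n";

# A negated class crosses a newline; "." does not unless /s is given.
my $tag = "[color=\nRed]";
print "dot:   ", ($tag =~ /^\[.+?\]$/    ? "match" : "no match"), "\n";
print "class: ", ($tag =~ /^\[[^\]]+\]$/ ? "match" : "no match"), "\n";
```

The nested [i] pair is rewritten on the first pass and the enclosing [b] pair on the second; the third pass finds nothing and ends the loop.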

Lots of ways to skin the cat, provided we can make a nice tasty stew. TIMTOWTDI.

cheers

tachyon

Re^3: Regex help
by ysth (Canon) on Aug 01, 2004 at 05:54 UTC
    I'd guess that your regex is still going to do a fair amount of backtracking. I'd suggest (?>(\w+))[^\]]* or (\w+)(=[^\]]*)? instead (untested).

    Update: this isn't just a backtracking issue; tachyon's original regex will match things like [color=Red][/col].
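
    The mismatch can be reproduced with a small sketch. The three patterns below are hypothetical: they assume an opening tag followed directly by its closing tag, and they are not the regex from earlier in the thread. The "greedy" one lets \w+ give characters back to [^\]]*, while the other two use the forms suggested above.

    ```perl
    use strict;
    use warnings;

    # Hypothetical patterns: an opening tag followed directly by its
    # closing tag. Not the regex from earlier in the thread.
    my @patterns = (
        [ greedy   => qr{\[(\w+)[^\]]*\]\[/\1\]}      ],
        [ atomic   => qr{\[(?>(\w+))[^\]]*\]\[/\1\]}  ],
        [ optional => qr{\[(\w+)(=[^\]]*)?\]\[/\1\]}  ],
    );

    # Returns the captured tag name, or undef when the pattern does not match.
    sub tag_name {
        my ($re, $string) = @_;
        return $string =~ $re ? $1 : undef;
    }

    for my $string ('[color=Red][/color]', '[color=Red][/col]') {
        for my $p (@patterns) {
            my ($label, $re) = @$p;
            my $name = tag_name($re, $string);
            printf "%-8s %-20s %s\n", $label, $string,
                defined $name ? "matches as '$name'" : "no match";
        }
    }
    ```

    With the greedy form, \w+ backs off from "color" to "col" and [^\]]* absorbs "or=Red", so [/col] is accepted as the closing tag. The atomic group refuses to give characters back, and the (=[^\]]*)? form requires whatever follows the name to be "=" or "]", so both reject the mismatched pair.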

      You are probably right, and as noted there are edge cases, as with all these sorts of things. Regardless of backtracking, it will hit most strings at least twice. Given that I (at least) am unfamiliar with widgets that use this formatting spec, I just put in a general suggestion. One of the great things about this site is that just about any hole/edge will be pointed out. Everyone learns. Something like you suggest, which accurately deals with the 'blah' and 'blah=foo' forms (assuming they are the only options), with some \s* tokens to allow for whitespace issues, is a little more robust. It is a pretty ugly RE but.....gotta hate metachars as formatting tokens.
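
      A sketch of what such an RE might look like, assuming 'blah' and 'blah=foo' really are the only tag forms. The to_html sub, the value="..." output and the whitespace rules are invented for illustration, and the /x flag is used only to keep the ugliness readable.

      ```perl
      use strict;
      use warnings;

      # Assumed grammar: [name] or [name=value], with optional whitespace
      # around the name, the "=", the value and the closing slash.
      my $pair = qr{
          \[ \s* (\w+) \s*               # tag name
             (?: = \s* ([^\]]*?) \s* )?  # optional =value, trimmed
          \]
          ( [^\[\]]* )                   # body without nested brackets
          \[ \s* / \s* \1 \s* \]         # closing tag must repeat the name
      }x;

      # Hypothetical converter: innermost pairs first, repeated until no
      # pair is left.
      sub to_html {
          my ($text) = @_;
          1 while $text =~ s{$pair}{
              defined $2 ? qq{<$1 value="$2">$3</$1>} : "<$1>$3</$1>"
          }ge;
          return $text;
      }

      print to_html('[ color = Red ]hi[ /color ]'), "\n";
      print to_html('[b]x[/b]'), "\n";
      print to_html('[color=Red]x[/col]'), "\n";   # mismatched, left alone
      ```

      Because the name must be followed by whitespace, "=" or "]", \w+ cannot be shortened to fit a different closing tag, so the mismatched pair in the last line is left untouched.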