in reply to A question of 'or' in a Regex

1. Is this using too many paren statements, or is this just per coder's discretion?
I'd say it's too many groupings, but it's similar to using parens in code to distinguish expressions: explicit, but potentially confusing. Personally I'd simplify it to /C (?:800 | (?:35 | 29 )(?:50 | 00XL))/x. Also, the regex doesn't do quite what you think: because the alternation sits at the top level, it matches either C800 or (?:(?:35|29)(?:50|00XL)) with no leading C (hence the slightly modified regex I presented).
2. This regex matches on 'C3550XL' and I'm not sure why
It matches for the same reason that ABC3550DEF will match: the regex isn't anchored, so it only has to find the pattern somewhere in the string, and the 3550 inside C3550XL is enough. So what you'll want is something like /\AC(?:800|(?:(?:35|29)(?:(?:50)|(?:00XL))))\z/ to get the expected match. See perlre for more info on \A and \z.
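
A minimal sketch of the difference anchoring makes. It assumes the original pattern is /C(?:800)|(?:(?:35|29)(?:(?:50)|(?:00XL)))/, as reconstructed from the discussion in this thread; the variable names and sample strings are illustrative only.

```perl
use strict;
use warnings;

# Assumed original pattern: unanchored, with '|' at the top level.
my $unanchored = qr/C(?:800)|(?:(?:35|29)(?:(?:50)|(?:00XL)))/;

# Anchored version suggested above: the whole string must match.
my $anchored = qr/\AC(?:800|(?:(?:35|29)(?:(?:50)|(?:00XL))))\z/;

for my $s (qw(C800 C3550 C2900XL C3550XL ABC3550DEF)) {
    printf "%-12s unanchored=%-3s anchored=%s\n", $s,
        ($s =~ $unanchored ? 'yes' : 'no'),
        ($s =~ $anchored   ? 'yes' : 'no');
}
```

With these samples, C3550XL and ABC3550DEF should match only the unanchored pattern, since each contains 3550 as a substring but is not a complete model name.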
HTH

_________
broquaint

Re: Re: A question of 'or' in a Regex
by diotalevi (Canon) on May 28, 2003 at 15:26 UTC

    You parsed it incorrectly. (I parsed broquaint's regex incorrectly.) By just adding whitespace I get the following:

    / C (?: 800 ) | (?: (?: 35 | 29 ) (?: (?: 50 ) | (?: 00XL ) ) ) /x

    This shows that some of the (?:) groups were completely useless. They impose zero runtime overhead, so they only really matter because they make the regex harder for the next programmer to read. Removing the useless (?:) groups, I get:

    / C800 | (?: 35 | 29 ) (?: 50 | 00XL ) /x
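
A quick way to check that dropping the groups changes nothing, and to see how the result differs from broquaint's regrouped pattern. This is only a sketch: the variable names and sample strings are made up for illustration, and the three patterns are copied from this thread.

```perl
use strict;
use warnings;

# Original pattern with whitespace added (all groups kept).
my $expanded   = qr/ C (?: 800 ) | (?: (?: 35 | 29 ) (?: (?: 50 ) | (?: 00XL ) ) ) /x;

# Same pattern with the useless (?:) groups removed.
my $simplified = qr/ C800 | (?: 35 | 29 ) (?: 50 | 00XL ) /x;

# broquaint's version: the leading C now applies to every alternative.
my $grouped    = qr/ C (?: 800 | (?: 35 | 29 ) (?: 50 | 00XL ) ) /x;

for my $s (qw(C800 3550 C3550 2900XL C2900XL C3500)) {
    printf "%-8s expanded=%-3s simplified=%-3s grouped=%s\n", $s,
        ($s =~ $expanded   ? 'yes' : 'no'),
        ($s =~ $simplified ? 'yes' : 'no'),
        ($s =~ $grouped    ? 'yes' : 'no');
}
```

The first two patterns should agree on every sample. Strings without a leading C, such as 3550 and 2900XL, match them but not the grouped pattern, which is the precedence effect of the top-level alternation.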