in reply to Conditional regex

use YAPE::Regex::Explain;
die YAPE::Regex::Explain->new(
    '^(\w+)(\w+)?(?(2)\2\1|\1)$'
)->explain;
__END__
The regular expression:

(?-imsx:^(\w+)(\w+)?(?(2)\2\1|\1)$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  (                        group and capture to \2 (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )?                       end of \2 (NOTE: because you're using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \2)
----------------------------------------------------------------------
  (?(2)                    if back-reference \2 matched, then:
----------------------------------------------------------------------
    \2                       what was matched by capture \2
----------------------------------------------------------------------
    \1                       what was matched by capture \1
----------------------------------------------------------------------
   |                        else:
----------------------------------------------------------------------
    \1                       what was matched by capture \1
----------------------------------------------------------------------
  )                        end of conditional on \2
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

E:\>perl -Mre=debug -le"print 66666 if q[aa] =~ /^(\w+)(\w+)?(?(2)\2\1|\1)$/"
Freeing REx: `,'
Compiling REx `^(\w+)(\w+)?(?(2)\2\1|\1)$'
size 34 first at 2
synthetic stclass `ANYOF[0-9A-Z_a-z]'.
   1: BOL(2)
   2: OPEN1(4)
   4: PLUS(6)
   5: ALNUM(0)
   6: CLOSE1(8)
   8: CURLYX[1] {0,1}(17)
  10: OPEN2(12)
  12: PLUS(14)
  13: ALNUM(0)
  14: CLOSE2(16)
  16: WHILEM(0)
  17: NOTHING(18)
  18: GROUPP2(20)
  20: IFTHEN(28)
  22: REF2(24)
  24: REF1(33)
  26: LONGJMP(32)
  28: IFTHEN(32)
  30: REF1(33)
  32: TAIL(33)
  33: EOL(34)
  34: END(0)
floating `'$ at 1..2147483647 (checking floating) stclass `ANYOF[0-9A-Z_a-z]' anchored(BOL) minlen 1
Guessing start of match, REx `^(\w+)(\w+)?(?(2)\2\1|\1)$' against `aa'...
Found floating substr `'$ at offset 2...
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx `^(\w+)(\w+)?(?(2)\2\1|\1)$' against `aa'
  Setting an EVAL scope, savestack=3
   0 <> <aa>    |  1: BOL
   0 <> <aa>    |  2: OPEN1
   0 <> <aa>    |  4: PLUS
                     ALNUM can match 2 times out of 32767...
  Setting an EVAL scope, savestack=3
   2 <aa> <>    |  6: CLOSE1
   2 <aa> <>    |  8: CURLYX[1] {0,1}
   2 <aa> <>    | 16: WHILEM
                     0 out of 0..1 cc=140fb88
  Setting an EVAL scope, savestack=8
   2 <aa> <>    | 10: OPEN2
   2 <aa> <>    | 12: PLUS
                     ALNUM can match 0 times out of 32767...
  Setting an EVAL scope, savestack=8
                     failed...
                     restoring \2..\2 to undef
                     failed, try continuation...
   2 <aa> <>    | 17: NOTHING
   2 <aa> <>    | 18: GROUPP2
   2 <aa> <>    | 20: IFTHEN
   2 <aa> <>    | 30: REF1
                     failed...
                     failed...
                     failed...
   1 <a> <a>    |  6: CLOSE1
   1 <a> <a>    |  8: CURLYX[1] {0,1}
   1 <a> <a>    | 16: WHILEM
                     0 out of 0..1 cc=140fb88
  Setting an EVAL scope, savestack=8
   1 <a> <a>    | 10: OPEN2
   1 <a> <a>    | 12: PLUS
                     ALNUM can match 1 times out of 32767...
  Setting an EVAL scope, savestack=8
   2 <aa> <>    | 14: CLOSE2
   2 <aa> <>    | 16: WHILEM
                     1 out of 0..1 cc=140fb88
   2 <aa> <>    | 17: NOTHING
   2 <aa> <>    | 18: GROUPP2
   2 <aa> <>    | 20: IFTHEN
   2 <aa> <>    | 22: REF2
                     failed...
                     failed...
                     failed...
                     restoring \2..\2 to undef
                     failed, try continuation...
   1 <a> <a>    | 17: NOTHING
   1 <a> <a>    | 18: GROUPP2
   1 <a> <a>    | 20: IFTHEN
   1 <a> <a>    | 30: REF1
   2 <aa> <>    | 33: EOL
   2 <aa> <>    | 34: END
Match successful!
66666
Freeing REx: `^(\w+)(\w+)?(?(2)\2\1|\1)$'

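If you look at the output of -Mre=debug, you'll see that the regex engine first matches $1='aa'. That leaves nothing for the optional group, so $2 stays unset, the else branch is taken, and \1 fails because the input is used up. The engine then backtracks to $1='a' and tries $2='a', but now \2 fails for the same reason. Finally it gives up $2 (restoring it to undef), takes the else branch again, and this time \1 matches the second 'a'. You can read more about backtracking in perlre.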
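If you'd rather check the end result than read the trace, here is a minimal sketch that prints what the groups hold after a successful match. The helper name captures() is made up for this illustration; the pattern is the one from the post.

```perl
use strict;
use warnings;

# Same pattern as above, compiled once.
my $re = qr/^(\w+)(\w+)?(?(2)\2\1|\1)$/;

# Hypothetical helper: describe what each group captured, or say "no match".
sub captures {
    my ($str) = @_;
    return 'no match' unless $str =~ $re;
    return sprintf '$1=%s $2=%s', $1, defined $2 ? $2 : 'undef';
}

print captures('aa'), "\n";    # prints: $1=a $2=undef
```

So for 'aa' the match succeeds through the else branch, with $2 left undefined, which is what the last few lines of the debug trace show.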

I wouldn't say the 2nd \w+ is useless, because the intent seems to be to match strings like "OneTwoTwoOne" ($1='One', $2='Two', followed by \2\1).
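To make that guess about the intent concrete, here is a small sketch that runs the pattern over a few sample strings. The helper name describe() and the test strings other than "OneTwoTwoOne" are my own choices for illustration, not something from the original question.

```perl
use strict;
use warnings;

# Same pattern as above, compiled once.
my $re = qr/^(\w+)(\w+)?(?(2)\2\1|\1)$/;

# Hypothetical helper: report which branch of the conditional matched.
sub describe {
    my ($str) = @_;
    return "$str: no match" unless $str =~ $re;
    return defined $2
        ? "$str: \$1=$1 \$2=$2 (then \\2\\1)"
        : "$str: \$1=$1 (then \\1, \$2 unset)";
}

print describe($_), "\n" for qw(OneTwoTwoOne OneOne OneTwo);
```

"OneTwoTwoOne" goes through the \2\1 branch, "OneOne" only matches through the else branch, and "OneTwo" doesn't match at all.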
