It breaks the regex into three parts: single-quoted strings, double-quoted strings, and all others. The single- and double-quoted string parts are very similar. The logic used is:
$REx = qr{
    '
    (?> [^'\\?]* )
    (?:
      (?: (?: \\ | \?\?/ ) . | \?\?' | \? (?! \? ['/] ) )
      (?> [^'\\?]* )
    )*
    '
  |
    "
    (?> [^"\\?]* )
    (?:
      (?: (?: \\ | \?\?/ ) . | \?\?' | \? (?! \? ['/] ) )
      (?> [^"\?]* )
    )*
    "
  |
    (?:
      (?! / [/*] )
      (?: \?\?['/] | \? (?! \? ['/] ) | (?> [^?'"\s]+ ) )
    )+
}x;

And here's the explain output:
(?x-ims:                 # group, but do not capture (disregarding
                         # whitespace and comments) (case-sensitive)
                         # (with ^ and $ matching normally) (with . not
                         # matching \n):
  '                      # '\''
  (?>                    # match (and do not backtrack afterwards):
    [^'\\?]*             # any character except: ''', '\\', '?' (0
                         # or more times (matching the most amount
                         # possible))
  )                      # end of look-ahead
  (?x:                   # group, but do not capture (0 or more times
                         # (matching the most amount possible)):
    (?x:                 # group, but do not capture:
      (?x:               # group, but do not capture:
        \\               # '\'
       |                 # OR
        \?               # '?'
        \?               # '?'
        /                # '/'
      )                  # end of grouping
      .                  # any character except \n
     |                   # OR
      \?                 # '?'
      \?                 # '?'
      '                  # '\''
     |                   # OR
      \?                 # '?'
      (?!                # look ahead to see if there is not:
        \?               # '?'
        ['/]             # any character of: ''', '/'
      )                  # end of look-ahead
    )                    # end of grouping
    (?>                  # match (and do not backtrack afterwards):
      [^'\\?]*           # any character except: ''', '\\', '?'
                         # (0 or more times (matching the most
                         # amount possible))
    )                    # end of look-ahead
  )*                     # end of grouping
  '                      # '\''
 |                       # OR
  "                      # '"'
  (?>                    # match (and do not backtrack afterwards):
    [^"\\?]*             # any character except: '"', '\\', '?' (0
                         # or more times (matching the most amount
                         # possible))
  )                      # end of look-ahead
  (?x:                   # group, but do not capture (0 or more times
                         # (matching the most amount possible)):
    (?x:                 # group, but do not capture:
      (?x:               # group, but do not capture:
        \\               # '\'
       |                 # OR
        \?               # '?'
        \?               # '?'
        /                # '/'
      )                  # end of grouping
      .                  # any character except \n
     |                   # OR
      \?                 # '?'
      \?                 # '?'
      '                  # '\''
     |                   # OR
      \?                 # '?'
      (?!                # look ahead to see if there is not:
        \?               # '?'
        ['/]             # any character of: ''', '/'
      )                  # end of look-ahead
    )                    # end of grouping
    (?>                  # match (and do not backtrack afterwards):
      [^"\?]*            # any character except: '"', '\?' (0 or
                         # more times (matching the most amount
                         # possible))
    )                    # end of look-ahead
  )*                     # end of grouping
  "                      # '"'
 |                       # OR
  (?x:                   # group, but do not capture (1 or more times
                         # (matching the most amount possible)):
    (?!                  # look ahead to see if there is not:
      /                  # '/'
      [/*]               # any character of: '/', '*'
    )                    # end of look-ahead
    (?x:                 # group, but do not capture:
      \?                 # '?'
      \?                 # '?'
      ['/]               # any character of: ''', '/'
     |                   # OR
      \?                 # '?'
      (?!                # look ahead to see if there is not:
        \?               # '?'
        ['/]             # any character of: ''', '/'
      )                  # end of look-ahead
     |                   # OR
      (?>                # match (and do not backtrack
                         # afterwards):
        [^?'"\s]+        # any character except: '?', ''', '"',
                         # whitespace (\n, \r, \t, \f, and " ")
                         # (1 or more times (matching the most
                         # amount possible))
      )                  # end of look-ahead
    )                    # end of grouping
  )+                     # end of grouping
)                        # end of grouping
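To see the quoted-string logic in action, here is a minimal sketch that exercises only the single-quoted branch, copied from the regex above. The sample inputs and the helper name matches_whole are made up for illustration and are not from the original post. The sketch anchors the branch with \A and \z so that a sample counts only if the whole string is one literal.

```perl
use strict;
use warnings;

# Single-quoted branch only, copied from the regex in the post.
my $single = qr{
    '
    (?> [^'\\?]* )
    (?:
      (?: (?: \\ | \?\?/ ) . | \?\?' | \? (?! \? ['/] ) )
      (?> [^'\\?]* )
    )*
    '
}x;

# Hypothetical helper: true if the whole string is one single-quoted literal.
sub matches_whole {
    my ($s) = @_;
    return $s =~ /\A$single\z/ ? 1 : 0;
}

# Hypothetical sample inputs, not taken from the original thread.
my @samples = (
    q{'a'},            # plain literal
    q{'\''},           # backslash-escaped quote
    q{'??/''},         # trigraph ??/ standing in for the backslash
    q{'??''},          # trigraph ??' inside the literal
    q{'what?'},        # lone question mark
    q{'unterminated},  # no closing quote
);

for my $s (@samples) {
    printf "%-16s %s\n", $s, matches_whole($s) ? "match" : "no match";
}
```

Every sample except the unterminated one should report a match. The third and fourth samples show why the branch treats ??/ and ??' specially: in each, the quote that follows the two question marks belongs to the trigraph and does not end the literal.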
In reply to Re: really large regex misbehaving - WTF by japhy, in thread really large regex misbehaving by Anonymous Monk