in reply to Reading Reg Exp

You could also try YAPE::Regex::Explain. Thanks to Toolic for suggesting this module in a previous post. Here is an example using one of your regexes:

    #!perl
    use strict;
    use warnings;
    use YAPE::Regex::Explain;

    print YAPE::Regex::Explain->new(qr/(?:\w+\s+fish\s+){2}(\w+)\s+fish/i)->explain();

    __END__
    The regular expression:

    (?i-msx:(?:\w+\s+fish\s+){2}(\w+)\s+fish)

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?i-msx:                 group, but do not capture (case-insensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      (?:                      group, but do not capture (2 times):
    ----------------------------------------------------------------------
        \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                                 more times (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
        \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                                 or more times (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
        fish                     'fish'
    ----------------------------------------------------------------------
        \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                                 or more times (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
      ){2}                     end of grouping
    ----------------------------------------------------------------------
      (                        group and capture to \1:
    ----------------------------------------------------------------------
        \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                                 more times (matching the most amount
                                 possible))
    ----------------------------------------------------------------------
      )                        end of \1
    ----------------------------------------------------------------------
      \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                               more times (matching the most amount
                               possible))
    ----------------------------------------------------------------------
      fish                     'fish'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

Update: Link fixed.

Re^2: Reading Reg Exp
by JavaFan (Canon) on Aug 11, 2010 at 09:11 UTC
    \s+                      whitespace (\n, \r, \t, \f, and " ")
    That's actually incorrect. \s matches 25 different characters, although locale (and EBCDIC) can change the set of characters matched. Even in the LATIN-1 range, next line ("\x85") and no-break space ("\xA0") will be matched by \s if either the pattern or subject has the UTF-8 flag set.
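One rough way to check this on your own perl is to enumerate the code points that \s matches once the string carries the UTF-8 flag. The sketch below is only an illustration, not part of the original reply: the variable names are made up, it scans only code points below the surrogate range (an assumption that all whitespace lives there), and the exact count can differ between Perl and Unicode versions.

```perl
use strict;
use warnings;

# Collect every code point below the surrogate range whose character
# matches \s once the string carries the UTF-8 flag.
my @ws;
for my $cp (0 .. 0xD7FF) {
    my $char = chr $cp;
    utf8::upgrade($char);    # set the UTF-8 flag, even for code points < 256
    push @ws, $cp if $char =~ /\s/;
}

printf "%d characters match \\s:\n", scalar @ws;
printf "  U+%04X\n", $_ for @ws;

# Compare the same character with and without the UTF-8 flag.
my $nel    = "\x85";                  # NEXT LINE, initially not flagged
my $before = $nel =~ /\s/ ? 1 : 0;
utf8::upgrade($nel);
my $after  = $nel =~ /\s/ ? 1 : 0;
print "NEL matches \\s: before upgrade=$before, after upgrade=$after\n";
```

The script deliberately avoids `use utf8`, locale pragmas, and `use feature 'unicode_strings'`, since any of those can change which rules the regex engine applies to unflagged strings.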

      YAPE::Regex::Explain is probably only set up for the most common uses; since it uses YAPE::Regex to parse the regex, it probably can't detect encoding or locale. Since it is only providing an explanation of the regex, in most cases it wouldn't really matter.

        But even ignoring locale or encoding, it's still not listing 80% of the characters the class can match. That's like saying [a-z] matches all the vowels.