in reply to Re^2: Reading Reg Exp
in thread Reading Reg Exp

YAPE::Regex::Explain is probably only set up for the most common uses; since it uses YAPE::Regex to parse the regex, it probably can't detect encoding or locale. Since it only provides an explanation of the regex, that wouldn't really matter in most cases.

Re^4: Reading Reg Exp
by JavaFan (Canon) on Aug 11, 2010 at 10:45 UTC
    But even ignoring locale or encoding, it's still not listing 80% of the characters the class can match. That's like saying [a-z] matches all the vowels.

      According to the perlrecharclass manpage:

      \s matches any single character that is considered whitespace. In the ASCII range, \s matches the horizontal tab (\t), the new line (\n), the form feed (\f), the carriage return (\r), and the space.

      It also says:

      Without a locale or EBCDIC code page, \s matches the five characters mentioned in the beginning of this paragraph.

      Update: Link fixed.

        Both cases are talking about matching ASCII characters. The first mentions ASCII explicitly; the second is discussing non-UTF-8 matching.

        But the explanation of the regex cannot know whether UTF-8 matching is in effect, as that depends on the encoding of the subject string.
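To illustrate the point about the subject string, here is a minimal sketch. It assumes an ASCII platform, no `use locale`, no `unicode_strings` feature, and no `/u` or `/a` modifier. The helper name `matches_space` is made up for this example. The same pattern is applied to U+00A0 (NO-BREAK SPACE) stored two ways:

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if the string matches /\s/, 0 otherwise.
sub matches_space {
    my ($string) = @_;
    return $string =~ /\s/ ? 1 : 0;
}

# U+00A0 (NO-BREAK SPACE) stored as a single native byte.
my $as_bytes = "\xA0";

# The same character, upgraded to Perl's internal UTF-8 representation.
my $as_utf8 = "\xA0";
utf8::upgrade($as_utf8);

# A plain ASCII space, for comparison.
my $ascii = " ";

printf "byte string:     %d\n", matches_space($as_bytes);
printf "upgraded string: %d\n", matches_space($as_utf8);
printf "ASCII space:     %d\n", matches_space($ascii);
```

Under those assumptions the byte string should not match and the upgraded string should, even though both hold the same character and the pattern is identical. A tool that only sees the pattern has no way to tell which case applies.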

      Consonants, rather :)
        Out of 26 characters, 5 are vowels (ignoring the fact that 'y' and Welsh 'w' sometimes play the role of a vowel), which is about 20% of the characters 'a' to 'z'. \s matches 25 characters, but the explain lists 5.

        So, I did mean vowels where I wrote vowels.
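The character counts discussed in this thread can be checked with a short script. This is only a sketch: `count_whitespace` is a name invented for the example, and the totals depend on the Perl version and the Unicode data it ships with, so no expected numbers are given. Each character is forced into Perl's internal UTF-8 form so that Unicode matching rules apply.

```perl
use strict;
use warnings;

# Hypothetical helper: count the code points in [$first, $last] that
# /\s/ matches when the one-character string is stored as UTF-8.
sub count_whitespace {
    my ($first, $last) = @_;
    my $count = 0;
    for my $code_point ($first .. $last) {
        my $char = chr $code_point;
        utf8::upgrade($char);    # force Unicode matching rules
        $count++ if $char =~ /\s/;
    }
    return $count;
}

# U+3000 (IDEOGRAPHIC SPACE) is the highest whitespace code point,
# so scanning up to it covers every candidate.
my $ascii_count   = count_whitespace(0x00, 0x7F);
my $unicode_count = count_whitespace(0x00, 0x3000);

print "ASCII range:  $ascii_count\n";
print "up to U+3000: $unicode_count\n";
```

The second count should come out several times larger than the first, which is the gap between what the class can match and what the explanation lists.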