Re^3: regexp question

Re^4: regexp question by ramprasad27 (Sexton) on Oct 28, 2011 at 09:51 UTC
I meant to ask about the difference between `[\d]` and `\d`.
Re^5: regexp question by Anonymous Monk on Oct 28, 2011 at 11:20 UTC
Those are the same. The whole point of something like `\d` is to be a convenient abbreviation. It would be irritating (and error-prone) if you had to spell it out in full each time it was used inside `[]`.
Re^5: regexp question by furry_marmot (Pilgrim) on Oct 28, 2011 at 20:02 UTC
You're missing the whole point here. Square brackets are a character class. If I try to match on `[a-z0-9]`, I'm specifying one character that falls in the class of characters from a-z and 0-9. That is, I'm trying to match one character that could be any of those in the class.

But if I try to match on `[.]`, I'm specifying one character that falls in the class of characters that *are* a period. In other words, `[a-z]` could match 'a', or 'b', or 'c', etc., but `[.]` can only ever match a literal '.'. So `[.]` is exactly equivalent to `\.`, an escaped period. Thus it's a useless use of a character class.

To use another example of yours, `[\d\s]` will match one character that is either a digit or a whitespace character. It could match 9, or 8, or ' '. `\d` and `\s` retain their "magic" even in a character class.

`[\d] = \d = [0-9]`

The lesson here is: don't use single-character classes.

--marmot
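A minimal sketch of the points above, for anyone who wants to try them. The helper name `whole_match` and the sample strings are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $pattern matches the entire string, else 0.
sub whole_match {
    my ($pattern, $string) = @_;
    return $string =~ /\A(?:$pattern)\z/ ? 1 : 0;
}

# \d and \s keep their meaning inside square brackets.
print whole_match('[\d\s]', '9'), "\n";   # a digit
print whole_match('[\d\s]', ' '), "\n";   # a space
print whole_match('[\d\s]', 'a'), "\n";   # neither digit nor whitespace

# [.] is a literal period, just like \. ; an unbracketed . is the wildcard.
print whole_match('[.]', '.'), "\n";
print whole_match('[.]', 'x'), "\n";
print whole_match('\.',  'x'), "\n";
print whole_match('.',   'x'), "\n";
```

Inside the brackets the period loses its wildcard meaning, which is why `[.]` and `\.` reject 'x' while a bare `.` accepts it.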
Re^6: regexp question by choroba (Cardinal) on Oct 28, 2011 at 21:10 UTC
`[\d] = \d = [0-9]`

Not exactly true. See perlrecharclass:

"\d" matches a single character that is considered to be a digit. What is considered a digit depends on the internal encoding of the source string and the locale that is in effect. If the source string is in UTF-8 format, "\d" not only matches the digits '0' - '9', but also Arabic, Devanagari and digits from other languages. Otherwise, if there is a locale in effect, it will match whatever characters the locale considers digits. Without a locale, "\d" matches the digits '0' to '9'. See "Locale, EBCDIC, Unicode and UTF-8".
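A small sketch of that difference, assuming a string that Perl stores as UTF-8 internally. The sub names and the sample character (DEVANAGARI DIGIT THREE, U+0969) are chosen only for illustration, and the `/a` modifier in the last sub assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

# Hypothetical sample: DEVANAGARI DIGIT THREE. A code point above 255
# forces Perl to store the string as UTF-8 internally.
my $devanagari_three = "\x{0969}";

# Each sub returns 1 if its argument is exactly one matching character, else 0.
sub is_digit         { return $_[0] =~ /\A\d\z/    ? 1 : 0 }  # Unicode-aware \d
sub is_bracket_digit { return $_[0] =~ /\A[\d]\z/  ? 1 : 0 }  # same thing in a class
sub is_ascii_range   { return $_[0] =~ /\A[0-9]\z/ ? 1 : 0 }  # only '0' to '9'
sub is_digit_ascii   { return $_[0] =~ /\A\d\z/a   ? 1 : 0 }  # /a restricts \d to ASCII

for my $char ('7', $devanagari_three) {
    print join(' ',
        is_digit($char),
        is_bracket_digit($char),
        is_ascii_range($char),
        is_digit_ascii($char),
    ), "\n";
}
```

For '7' all four checks agree. For the Devanagari digit, `\d` and `[\d]` still match while `[0-9]` and `\d` under `/a` do not, so `[\d]` and `\d` are interchangeable but neither is strictly the same as `[0-9]`.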