Character classes let you match any one of a set of characters without too much hassle.
To define one, put the set of characters you want to match inside [].
For instance, [0123456789] matches any single digit. You can also negate a character
class by placing a caret at the front of it. For instance, [^0123456789] matches any
single character that is not a digit. Spelling classes out this way can be kind of
cumbersome, so you shouldn't be surprised that Perl makes your life much easier by
defining some character class abbreviations. These are alphanumeric characters preceded by a
backslash. For example, Perl lets you match any digit with \d in your regular expression.
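
As a small sketch of the ideas so far, the following Perl snippet compares an explicit character class, its negation, and the \d abbreviation. The sample strings and variable names are made up for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my $with_digit    = "room 42";
my $without_digit = "no digits here";

# Explicit character class: matches if the string contains any one digit.
my $explicit = ($with_digit =~ /[0123456789]/) ? 1 : 0;

# \d is the abbreviation for the same class (for plain ASCII text).
my $abbrev = ($with_digit =~ /\d/) ? 1 : 0;

# Negated class: captures the first character that is not a digit.
my ($first_nondigit) = "42nd" =~ /([^0123456789])/;

# A string with no digits fails to match \d.
my $none = ($without_digit =~ /\d/) ? 1 : 0;

print "explicit=$explicit abbrev=$abbrev first_nondigit=$first_nondigit none=$none\n";
```

Both the explicit class and \d succeed on the first string, the negated class picks out the "n" in "42nd", and the last test fails because the string contains no digit.
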
Now for a quick word about metacharacters. Metacharacters are characters that have special meaning within regular
expressions. If you put one into a regular expression it won't match literally unless you precede the
metacharacter with a \. The metacharacters covered here are \ | ( ) [ ] $ ^ . ? * so here is a quick word about each of them before
we return to character class abbreviations.
Metacharacter(s) | Meaning |
. | Matches any single character besides newline (unless /s is used) |
() | Used for grouping characters |
[] | Used for defining character classes |
| | Alternation: matches either the pattern on its left or the one on its right |
\ | Begins a character class abbreviation, or makes the following metacharacter match literally |
* | Quantifier: matches 0 or more of the previous character or group of characters |
? | Quantifier: matches 0 or 1 of the previous character or group; placed after another quantifier, makes it nongreedy |
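^ | Matches the beginning of a string (or line if /m is used); as the first character inside [] it negates the class |
$ | Matches the end of a string, or just before a trailing newline (or the end of a line if /m is used) |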
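
To make a few of these metacharacters concrete, here is a minimal Perl sketch covering escaping, alternation with grouping, anchors, and greedy versus nongreedy matching. The strings are invented for illustration and are not from any real data.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for illustration.
my $price = 'cost: $3.50 (approx)';

# Unescaped, . matches any character, so "3x50" also matches.
my $loose = ('3x50' =~ /3.50/) ? 1 : 0;

# Escaped, \. matches only a literal period and \$ a literal dollar sign.
my $strict_miss = ('3x50' =~ /3\.50/) ? 1 : 0;
my $strict_hit  = ($price =~ /\$3\.50/) ? 1 : 0;

# | gives alternatives and () groups them; ^ and $ anchor the match.
my $anchored = ('dog' =~ /^(cat|dog)$/) ? 1 : 0;

# * is greedy by default; a following ? makes it match as little as possible.
my ($greedy) = '<a><b>' =~ /(<.*>)/;
my ($lazy)   = '<a><b>' =~ /(<.*?>)/;

print "loose=$loose strict_miss=$strict_miss strict_hit=$strict_hit anchored=$anchored\n";
print "greedy=$greedy lazy=$lazy\n";
```

The greedy pattern swallows the whole string '<a><b>', while the nongreedy one stops at the first '>' and captures only '<a>'.
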
Now let's define some character class abbreviations.
Character Class | Meaning |
\d | digit or [0123456789] |
\D | nondigit or [^0123456789] |
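\w | word character (alphanumeric or underscore) or [a-zA-Z_0-9] |
\W | nonword character or [^a-zA-Z_0-9] |
\b | word boundary (a zero-width position between a \w and a \W character, not a character class) |
\s | whitespace character or [ \t\r\n\f] |
\S | nonwhitespace character or [^ \t\r\n\f] |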
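
As a quick sketch of these abbreviations in use, the Perl snippet below applies each one to a made-up line of text. The sample line and variable names are assumptions for illustration. It also uses the + quantifier (one or more of the previous item), which the text above has not introduced.

```perl
use strict;
use warnings;

# Hypothetical sample line, chosen only for illustration.
my $line = "order_7 shipped\ton 2024-01-15";

# \w+ grabs a run of word characters (letters, digits, underscore).
my ($first_word) = $line =~ /(\w+)/;

# \d repeated explicitly picks out a date-like run of digits.
my ($year, $month, $day) = $line =~ /(\d\d\d\d)-(\d\d)-(\d\d)/;

# \s+ splits on any run of whitespace (here a space and a tab).
my @fields = split /\s+/, $line;

# \b keeps "on" from matching inside a longer word such as "onward".
my $whole_word = ("onward" =~ /\bon\b/) ? 1 : 0;

# \D is the negation of \d: strip every nondigit from the date.
(my $digits_only = "2024-01-15") =~ s/\D//g;

print "first_word=$first_word date=$year/$month/$day fields=", scalar(@fields), "\n";
print "whole_word=$whole_word digits_only=$digits_only\n";
```

Note that \w includes the underscore, so the first word captured is "order_7" rather than "order".
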
That's a lot of information to get a handle on, so let's check out some
pattern-matching examples.