in reply to RegEx: Why is [.] not a valid character class?
Quoting from Jeffrey Friedl's excellent book Mastering Regular Expressions, 2nd Ed.:
Usually, dot does not match a newline. The original Unix regex tools worked on a line-by-line basis, so the thought of matching a newline wasn't even an issue until the advent of sed and lex. By that time, '.*' had become a common idiom to match "the rest of the line," so the new languages disallowed it from crossing line boundaries in order to keep it familiar.[1] Thus, tools that could work with multiple lines (such as a text editor) generally disallow dot from matching a newline. (Mastering Regular Expressions, Second Edition, p. 110.)
[1] As Ken Thompson (ed's author) explained it to me, it kept '.*' from becoming "too unwieldy." (Mastering Regular Expressions, Second Edition, p. 110.)
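To see the behavior the quote describes, here is a minimal sketch in Python (my choice for illustration, not something from the book; the sample string and variable names are made up). It assumes Python's standard `re` module, where the `re.DOTALL` flag plays roughly the role that the `/s` modifier plays in Perl:

```python
import re

# Hypothetical two-line sample string, used only for illustration.
text = "first line\nsecond line"

# By default, '.' does not match a newline, so '.*' stops at the end of the first line.
default_match = re.match(r".*", text).group(0)

# With re.DOTALL, '.' also matches a newline, so '.*' runs to the end of the string.
dotall_match = re.match(r".*", text, re.DOTALL).group(0)

print(repr(default_match))
print(repr(dotall_match))
```

Without the flag, '.*' keeps its familiar "rest of the line" meaning; you have to opt in explicitly to let dot cross a line boundary.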
I strongly recommend this book to anyone fighting with regular expressions. It's a complete, well-written reference on the topic, and it gives excellent examples. It also covers regular expressions as they relate to several languages, including Perl, PHP, JavaScript, Java, and .NET.
HTH,
/Larry