in reply to why this regular expression is so slow?
Simply changing "\S" to "\w" converts a regex that takes forever to an instant match (at least this is so on my machine - Debian Etch, Perl 5.8.8).
The reason is that \S and [\.-]* can both match "." and "-". Without a distinctive character to separate them, each time a match attempt fails the regex engine has to backtrack and try handing those characters to the other subpattern instead. And there are a lot of "." and "-" to provide opportunities for backtracking. Additionally, because [\.-]* is greedy and can match an indefinite number of characters, the regex engine may have to process a large portion of a very long string before it discovers that a match has failed.
Changing \S to \w makes the two subpatterns mutually exclusive, which eliminates the majority of the backtracking and produces an instant match.
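To make the overlap concrete, here is a rough sketch that counts how often the engine retries the first subpattern. The pattern and the input string are made-up stand-ins (the regex from the parent post is not quoted in this reply), and the (?{ ... }) counter is just an illustrative trick, so read the numbers as an indication rather than a benchmark.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the pattern in the parent post (not quoted here).
our $steps;    # package variable so the (?{ ... }) blocks can update it

# A run of "." and "-" followed by a space, so the final $ can never match.
my $input = 'a' . ('.-' x 6) . ' ';

# Overlapping version: \S and [\.-] both accept "." and "-".
# The code block runs each time \S+ hands a candidate to the rest of the pattern.
$steps = 0;
my $matched_S = $input =~ /^(?:\S+(?{ $steps++ })[\.-]*)+$/ ? 1 : 0;
my $steps_S   = $steps;

# Mutually exclusive version: \w accepts neither "." nor "-".
$steps = 0;
my $matched_w = $input =~ /^(?:\w+(?{ $steps++ })[\.-]*)+$/ ? 1 : 0;
my $steps_w   = $steps;

print "\\S version: matched=$matched_S, \\S+ attempts=$steps_S\n";
print "\\w version: matched=$matched_w, \\w+ attempts=$steps_w\n";
```

Both versions fail to match, but the \S version should report far more attempts, because every "." and "-" can be claimed by either subpattern. Lengthening the run of ".-" should make the gap grow quickly.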
Best, beth
Note: for those who don't want to hunt down an ASCII table [\.-] is [\.-].