in reply to Re: Inefficient regex
in thread Inefficient regex
Exactly. In more detail, what the original regex ends up doing goes like this:

    <%start_(.*?)\s*(.*?)\s*%>\n?(.+?)<%end_\1%>\n?
            $1-  -A-$2-  -B-     $3-  C-

    Quickly matches <%start_
    Tries $1 matching ""
      Tries A matching ""
      Tries $2 matching "wibble XXX"
      Tries $3 matches to just before "<%end_flub%>"
      Is forced to backtrack by C
      Tries $3 matches to just before "<%end_wibble%>"
      Is forced to backtrack by C
    Tries $1 matching "w"
      Tries A matching ""
      Tries $2 matching "ibble XXX"
      Tries $3 matches to just before "<%end_flub%>"
      Is forced to backtrack by C
      Tries $3 matches to just before "<%end_wibble%>"
      Is forced to backtrack by C
    Tries $1 matching "wi"
      Tries A matching ""
      Tries $2 matching "bble XXX"
      Tries $3 matches to just before "<%end_flub%>"
      Is forced to backtrack by C
      Tries $3 matches to just before "<%end_wibble%>"
      Is forced to backtrack by C
    ...
    Tries $1 matching "wibble"
      Tries A matching " "
      Tries $2 matching "XXX"
      Tries $3 matching to just before "<%end_flub%>"
      Is forced to backtrack by C
      Tries $3 matches to just before "<%end_wibble%>"
      Succeeds finding one match

which you can see wastes a lot of time.
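One rough way to watch this happen is to drop a `(?{ ... })` code block into the pattern just before the third group and count how often the engine gets that far. This is only a sketch: the input string is made up to resemble the one in the trace, no modifiers are used (the original poster's are not shown here), and the names `$text` and `$tries` are mine.

```perl
use strict;
use warnings;

# Hypothetical input, shaped like the one the trace walks through.
my $text = "<%start_wibble XXX%>\nfoo <%end_flub%> bar<%end_wibble%>\n";

# Package variable so the embedded code block can update it.
our $tries = 0;

# Same pattern as in the trace, plus a code block that runs every time
# the engine arrives at the start of $3.
my ($tag, $args, $body) =
    $text =~ m{<%start_(.*?)\s*(.*?)\s*%>\n?(?{ $tries++ })(.+?)<%end_\1%>\n?};

print "tag=$tag args=$args\n" if defined $tag;
print "arrived at \$3 $tries times\n";
```

Every arrival but the last is a dead end that has to be backed out of, which is the wasted work the trace shows.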
That is one reason why Death to dot star! suggests you use a character class instead of . whenever possible in a regex.
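For this pattern, that advice might look something like the following. Treat it as a sketch under assumptions the thread does not state: that tag names consist of word characters (hence `\w+`) and that the argument text never contains a `%` (hence `[^%]*?`). The sample input and the names `$dotty` and `$classy` are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical input, shaped like the one the trace walks through.
my $text = "<%start_wibble XXX%>\nfoo <%end_flub%> bar<%end_wibble%>\n";

# Original pattern: every capture is a lazy dot.
my $dotty  = qr{<%start_(.*?)\s*(.*?)\s*%>\n?(.+?)<%end_\1%>\n?};

# Assumed rewrite: $1 is limited to word characters and
# $2 is not allowed to run past a '%'.
my $classy = qr{<%start_(\w+)\s*([^%]*?)\s*%>\n?(.+?)<%end_\1%>\n?};

# In list context a match returns its captures.
my @old = $text =~ $dotty;
my @new = $text =~ $classy;

print "old: @old\n";
print "new: @new\n";
```

On this input both patterns capture the same three pieces, but with `\w+` the engine takes the whole tag name in one go, so $1 is already "wibble" the first time it gets to $3 instead of working up from the empty string.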
- tye
Re: Re^2: Inefficient regex (death to dot star)
by Wibble (Beadle) on Mar 05, 2003 at 21:07 UTC