in reply to RE: RE: Regular Expression Optimizer
in thread Regular Expression Optimizer
The execution time of a RE is dependant on both the RE and on the string it is working on (I think... I might have to think some more...). As such, the measure we are interested in is not T(r), but T(r,s).
I'll define an "optimal RE r for language L" to be an RE r that recognises L and T(r,s) <= T(r',s) for all r' recognising L and all s in L.
It's possible that there is a language G, such that an r, r' recognising G and s,s' in G exist with the property that T(r,s) <= T(r'',s) and T(r',s') <= T(r'',s') for all r'' recognising G, but T(r,s)<T(r',s) and T(r',s')<T(r,s'). In otherwords, r is the fastest for s but not for s', and vice versa. Which is the "optimal" one for G?
So "fastest" might not always be possible.
As for the other point... Assuming that there is always an RE for a language matching my definition of "optimal", then why can't there be two? Why can't we have r != r' but T(r,s) = T(r',s) for all s? Then which should the optimizer return? Or does it matter? I could very easily see R, R' that optimize into r, r' respecively, for the same language. Your argument isn't flawed due to the believe in a "most optimal" RE, just that you believe there is a unique "most optimal" RE for any language.
But it may be that in traditional REs (not Perl REs, which have...complications) the idea of a "fastest" RE may be nonsensical. The standard machines to match regular languages are DFAs, which have one state transition per input character. Therefore, the running time of any DFA is a function solely of the length of the input string, and not of the RE used to build it. "Optimal" DFA for a language usually refers to numbers of states, not speed.
What do you mean by your question about the REO machine being able to make only algorithmic manipulations to the input RE? What other sort of manipulations were you thinking of?
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
RE (tilly) 4: Regular Expression Optimizer
by tilly (Archbishop) on Aug 31, 2000 at 06:12 UTC | |
|
RE:^4 Regular Expression Optimizer
by PsychoSpunk (Hermit) on Aug 31, 2000 at 08:30 UTC |