in reply to RE: RE: RE: Regular Expression Optimizer
in thread Regular Expression Optimizer

I was getting a bit disillusioned with the possibility of finding the fastest as well. However, as it's stated in Camel 2E, once the expression matches, the Perl Engine quits. So while in general, the execution of the RE is dependent on the RE and the string which it matches (For the sake of this argument, although not true in all cases in real life, I'm saying that anything that doesn't match takes up n amount of time. This is simply done since any failed string will take up the most time for any RE whether optimized or not), there will eventually be a regex such that, for every valid string in the language checked by the regex, it will perform optimally. It will not perform every execution in t seconds for every case, since it depends on the string; but of two equivalent REs (that share the exact same language), this optimized one will always execute faster than the suboptimal one. Fastest may be a pipe dream in all cases, but it could be possible to return the faster of all possibilities. At which point, we're now getting into the limitations of the implementation rather than anything else.

I will concede that there may be a case where two equally optimal REs are available. This is entirely plausible, and I sometimes fall prey to a binary train of thought. Thus, while I've been reading replies to all of this, I've thought more as I go along, and since I did want to spur dialogue, I've let my thoughts wander down the ways everyone else is thinking to see how approachable this could be. Which of course leads me to your final two questions:

Obviously, this whole discussion of language theory has mathematical rules under which it operates. I think everyone here is aware of what a regular expression optimizer runs up against simply by the fact that no one was able to say "Oh, you want to go look at x." So, my question was simply if heuristic manipulations would be allowed to the input RE. Basically, what if during implementation, a few "leap of faith" type heuristics were involved, but they also performed remarkably well. Does that affect the reception of my REO machine into the community since it doesn't always act as expected mathematically? ALL HAIL BRAK!!!