in reply to Removing backtracking from a .*? regexp

I think I understand what you're talking about, but I'm not sure. You see, the problem is the way you framed the question. The first line of the post suggests that you have a performance problem. In general, a performance problem is one in which systems are being overloaded or time constraints aren't being met.

From the rest of the post, however, I get the impression that this is just something you do every now and then, and that it's just a little slow and maybe you have to go make a cup of coffee while it runs.

The two are very different. If you are hunting this data regularly in a constrained environment, there are numerous techniques you can use to boost the performance of your search, such as building sorted indexes of characters and character pairs ahead of time, so that each search only has to look at the places where a match could start.
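To give a rough idea of what a pair index might look like, here is a minimal sketch in Python. It is only one reading of the idea, not a tuned implementation, and the names build_pair_index and find_all are made up for illustration. The index maps every two-character pair to the positions where it occurs, and a search only verifies the positions recorded for the first pair of the search string.

```python
from collections import defaultdict

def build_pair_index(text):
    """Map each two-character pair to the ascending list of offsets where it starts."""
    index = defaultdict(list)
    for i in range(len(text) - 1):
        index[text[i:i + 2]].append(i)
    return index

def find_all(text, index, needle):
    """Return every offset where needle occurs, checking only indexed candidates."""
    if len(needle) < 2:
        raise ValueError("needle must be at least two characters long")
    # Only offsets that begin with the needle's first pair can possibly match.
    candidates = index.get(needle[:2], [])
    return [i for i in candidates if text.startswith(needle, i)]

text = "teateteatea"
index = build_pair_index(text)
print(find_all(text, index, "teatea"))
```

The index costs time and memory to build, so it only pays off when the same data is searched many times between changes.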

On the other hand, these will take effort to implement. If you're just getting bored of waiting and want a faster regex, the one you put in the update is probably going to be about as good as it gets. You won't be able to remove the backtracking entirely. As a worst-case example, imagine you're looking for "teatea" in the string "teateteatea". The first five characters match, the sixth doesn't, and the engine has to go back and try again from a later starting point. It's going to backtrack no matter what regular expression you use to pull off the match.
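The "teatea" case can be traced with a small sketch. This is Python rather than a real regex engine, and naive_find is a hypothetical helper that simply tries each starting position in turn, which is roughly how a backtracking matcher treats a literal string. It counts the attempts that matched at least one character before failing, since those are the ones that force the search to back up.

```python
def naive_find(text, pattern):
    """Return (match_index, rewinds); match_index is -1 if pattern is absent.

    rewinds counts the attempts that matched one or more characters and
    then failed, so the scan had to return to an earlier point in text.
    """
    rewinds = 0
    for start in range(len(text) - len(pattern) + 1):
        matched = 0
        while matched < len(pattern) and text[start + matched] == pattern[matched]:
            matched += 1
        if matched == len(pattern):
            return start, rewinds
        if matched > 0:
            rewinds += 1
    return -1, rewinds

print(naive_find("teateteatea", "teatea"))
```

The attempt at offset 0 gets five characters in before failing and the attempt at offset 3 gets two in, so the match at offset 5 is found only after two abandoned partial matches.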

The other viable alternative is to push all the data into a decent database and let it worry about the searching. It all depends on how often the data changes and how often you do the searches.

It's a pity that you didn't specify those factors :/
