in reply to Would study help this regexp's performance?
in thread Surprisingly poor regex performance

study only really helps (i.e. saves time) if you are going to search the studied string several times, enough to offset the cost of the studying itself. And even then, only if your search term contains one or more characters that occur rarely in the studied string.
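As a rough sketch of the "study once, search many times" case, something like the following is what I have in mind. The helper name count_matching_patterns and the sample data are made up for illustration. Also note that on recent perls (5.16 and later, as far as I know) study is a no-op, so any benefit applies only to older versions.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many of the given patterns match $text.
# The string is studied once and then searched once per pattern, which is
# the only situation where the up-front cost of study could be repaid.
sub count_matching_patterns {
    my ($text, @patterns) = @_;
    study $text;    # one-off cost; does nothing on newer perls
    my $hits = 0;
    for my $re (@patterns) {
        $hits++ if $text =~ $re;
    }
    return $hits;
}

# 'Z', 'Q' and friends are rare in this text, which is the case study favours.
my $text     = ('the quick brown fox jumps over the lazy dog ' x 1000) . 'ZEBRA';
my @patterns = (qr/ZEBRA/, qr/QUARTZ/, qr/dog ZEBRA/);

print count_matching_patterns($text, @patterns),
    " of ", scalar(@patterns), " patterns matched\n";
```

Whether this beats the unstudied version would have to be measured (the Benchmark module is the usual tool); I would not assume it does.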

I've wondered whether study could be updated to take a parameter n, so that it builds its table from groups of n chars rather than single chars: a given triplet will usually occur less often than a given pair, and a pair less often than an individual char.
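To make the idea concrete, here is a minimal pure-Perl sketch of such a table. It is not how study is implemented, and the names ngram_positions and candidate_offsets are invented for this post. It only handles a literal search string of at least n chars, and the offsets it returns are candidates that still need checking.

```perl
use strict;
use warnings;

# Hypothetical: map every n-char substring of $text to the list of
# positions at which it starts.
sub ngram_positions {
    my ($text, $n) = @_;
    my %pos;
    for my $i (0 .. length($text) - $n) {
        push @{ $pos{ substr($text, $i, $n) } }, $i;
    }
    return \%pos;
}

# Hypothetical: for a literal $needle (assumed to be at least $n chars),
# pick its rarest n-gram in the table and return the offsets at which the
# needle would have to start. An empty list means it cannot occur.
sub candidate_offsets {
    my ($table, $needle, $n) = @_;
    my ($best, $best_k);
    for my $k (0 .. length($needle) - $n) {
        my $list = $table->{ substr($needle, $k, $n) }
            or return;    # this n-gram never occurs, so neither does the needle
        ($best, $best_k) = ($list, $k) if !$best || @$list < @$best;
    }
    return unless $best;  # needle shorter than $n: not handled by this sketch
    return grep { $_ >= 0 } map { $_ - $best_k } @$best;
}

my $table = ngram_positions('abracadabra', 3);
print join(',', candidate_offsets($table, 'cad', 3)), "\n";    # prints 4
```

The table grows with n (up to one entry per position in the string), so the extra selectivity is paid for in memory and build time.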

Of course, the regex engine would need updating to make use of the information, and that's a very scary task to contemplate.


Examine what is said, not who speaks.        The end of an era!
"But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
"Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

Re^2: Would study help this regexps performance?
by sgifford (Prior) on Dec 15, 2004 at 15:45 UTC
    study only really helps (i.e. saves time) if you are going to be searching the studied string multiple times to offset the cost of the studying itself

    And in this case, I'm searching the string only once, so it almost certainly would have hurt rather than helped.

      I agree. I think that judicious use of the cut operator (?>...), also known as an atomic or independent subexpression, may have helped your original regex avoid backtracking, but I haven't done any benchmarking to prove that thought.
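For anyone unfamiliar with (?>...), here is a small sketch of what it does. The patterns are deliberately trivial and are not the regex from the original question; they only show that the group refuses to give back what it has matched.

```perl
use strict;
use warnings;

# Illustrative patterns only, not the regex from the original question.
my $plain  = qr/^\d+\d$/;       # \d+ can give back a digit so the final \d matches
my $atomic = qr/^(?>\d+)\d$/;   # (?>...) never gives anything back once it has matched

my $input = '12345';
printf "plain=%s atomic=%s\n",
    ($input =~ $plain  ? 'match' : 'no match'),
    ($input =~ $atomic ? 'match' : 'no match');
# prints: plain=match atomic=no match
```

Because the engine cannot re-enter the group to try shorter matches, a failing match gives up sooner. The flip side is visible above: the set of strings that match can change, so it is only safe where giving characters back could never lead to a match you want.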

