in reply to Surprisingly poor regex performance

Would using study help here?

Note this is only a question, not a recommendation — I've never used study, but from its docs this sounds like a situation where it might help.

Smylers

Retitled by davido.
Re-retitled by Smylers, so that it's grammatically correct again (and I don't look like an idiot who doesn't know to use apostrophes).

Re: Would study help this regexps performance?
by BrowserUk (Patriarch) on Dec 15, 2004 at 10:41 UTC

    study only really helps (i.e. saves time) if you are going to be searching the studied string multiple times to offset the cost of the studying itself. And then only if your search term contains one or more characters that occur rarely in the studied string.
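
    As a rough illustration of the kind of usage where study was meant to pay off, here is a minimal sketch: one string, studied once, then searched several times for terms containing rare characters. The sample text and the terms are made up for the example. Note also that perlfunc documents study as a no-op since Perl 5.16, so on a modern perl this shows the calling pattern only, not a speed-up.

```perl
use strict;
use warnings;

# Hypothetical data: one long string that will be searched repeatedly.
my $text  = join ' ', ('lorem ipsum dolor sit amet') x 1000, 'quartz', 'xylophone';

# Hypothetical search terms, each containing characters (q, z, x, y, b)
# that are rare or absent in the text above.
my @terms = qw(quartz xylophone zebra);

# Study the string once, before the repeated searches.
# (Documented as a no-op on Perl 5.16 and later, so harmless there.)
study $text;

my %found;
for my $term (@terms) {
    # \Q...\E makes the term match literally.
    $found{$term} = ($text =~ /\Q$term\E/) ? 1 : 0;
}

printf "%-10s %s\n", $_, ($found{$_} ? 'found' : 'missing') for @terms;
```

    With a single search, as in the original question, there is nothing to amortise the cost of the study call against.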

    I've wondered whether study could be updated to take a parameter n, where it then builds the table from groups of n chars, since any given triplet tends to occur less often than any given doublet, and doublets less often than individual chars.
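
    To make the intuition concrete, here is a small sketch that counts groups of n characters in a sample string and reports the most common group for n = 1, 2 and 3. This is only an illustration of why longer groups are rarer; ngram_counts is a hypothetical helper, and none of this reflects how study is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each n-character group occurs.
sub ngram_counts {
    my ($string, $n) = @_;
    my %count;
    $count{ substr($string, $_, $n) }++ for 0 .. length($string) - $n;
    return \%count;
}

# Made-up sample text.
my $text = 'the cat sat on the mat with the hat';

my %max;    # highest frequency seen for each group size
for my $n (1 .. 3) {
    my $count = ngram_counts($text, $n);

    # Most frequent group first; ties broken alphabetically.
    my ($top) = sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count;
    $max{$n} = $count->{$top};

    printf "n=%d  distinct groups=%d  most common '%s' x %d\n",
        $n, scalar(keys %$count), $top, $count->{$top};
}
```

    The most common group can never become more frequent as n grows, because every occurrence of a longer group contains an occurrence of its shorter prefix.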

    Of course, the regex engine would need updating to make use of the information, and that's a very scary task to contemplate.


    Examine what is said, not who speaks.        The end of an era!
    "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
    "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
      study only really helps (i.e. saves time) if you are going to be searching the studied string multiple times to offset the cost of the studying itself

      And in this case, I'm searching the string only once, so it almost certainly would have hurt rather than helped.

        I agree. I think that judicious use of the cut operator (?>...), which perlre calls an independent subexpression, may have helped your original regex avoid backtracking, but I haven't done any benchmarking to prove that thought.
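
        For anyone unfamiliar with (?>...), here is a minimal sketch of what it does, using a toy pattern rather than the regex from the original question. Inside the group the engine matches as usual, but once the group has matched it will not hand back any characters to let the rest of the pattern succeed.

```perl
use strict;
use warnings;

my $subject = 'aaab';

# Plain greedy quantifier: a+ first grabs 'aaa', then backtracks to 'aa'
# so that the trailing 'ab' can match.
my $plain = ($subject =~ /^a+ab/) ? 1 : 0;

# Independent subexpression: once (?>a+) has matched 'aaa' it gives
# nothing back, so the trailing 'ab' can never match.
my $atomic = ($subject =~ /^(?>a+)ab/) ? 1 : 0;

print "plain:  ", ($plain  ? 'match' : 'no match'), "\n";
print "atomic: ", ($atomic ? 'match' : 'no match'), "\n";
```

        That is why the use has to be judicious: the cut saves the backtracking work, but it can also change what the pattern matches, so it only belongs around parts that should never need to give characters back.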

