How to improve the performance of a regular expression

It's hard work! If it is important enough for you to spend your time on (and you're a masochist!), then try enabling use re 'debug';. This will tell you FMTYEWTK about what is going on inside the regex engine.
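A minimal sketch of what enabling the pragma looks like. The sample string and pattern are hypothetical, picked only for illustration; the trace itself goes to STDERR, and its exact content varies between perl versions.

```perl
use strict;
use warnings;

# Hypothetical sample data and pattern, for illustration only.
my $text = 'foo=123; bar=456';

{
    # On perl 5.10 and later the pragma is lexically scoped, so only
    # regexes inside this block are traced. The compile and match
    # traces are written to STDERR.
    use re 'debug';

    if ( $text =~ /(\w+)=(\d+)/ ) {
        print "matched: $1 => $2\n";
    }
}
```

Run it with STDERR redirected to a file if you want to read the trace at leisure.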

As for "Are there heuristics you've learned?", the most basic one is reScriptWithDebug 2> log && wc -l log (the debug trace goes to STDERR, hence the 2>). Lower numbers mean "more efficient"!
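The same line-counting heuristic can be sketched from inside Perl. This assumes a Unix-like system and that the child perl inherits the redirected STDERR; trace_lines is a hypothetical helper name, and the two candidate patterns are made up for the example. The counts differ between perl versions, so only compare numbers produced by the same perl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Run one pattern against one input in a child perl with re=debug enabled,
# and return the number of trace lines the regex engine wrote to STDERR.
# (trace_lines is a hypothetical helper, not part of any module.)
sub trace_lines {
    my ( $pattern, $input ) = @_;
    my ( $log_fh, $log_file ) = tempfile( UNLINK => 1 );
    close $log_fh;

    # Temporarily point STDERR at the log file; the child inherits it.
    open my $saved_err, '>&', \*STDERR or die "Cannot dup STDERR: $!";
    open STDERR, '>', $log_file or die "Cannot redirect STDERR: $!";
    system $^X, '-Mre=debug', '-e',
        'my ($re, $str) = @ARGV; $str =~ /$re/', $pattern, $input;
    open STDERR, '>&', $saved_err or die "Cannot restore STDERR: $!";

    # Count the lines of trace that were logged.
    open my $in, '<', $log_file or die "Cannot read $log_file: $!";
    my $count = () = <$in>;
    close $in;
    return $count;
}

# Two made-up candidates that match the same text.
for my $candidate ( 'a+b', '(?:a|aa)+b' ) {
    printf "%6d lines  %s\n", trace_lines( $candidate, 'xaaab' ), $candidate;
}
```

Treat the counts as a rough guide to how much work the engine did, not as timings.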

Essentially, the less log your regex generates, the faster it will run (without the logging). Which of course "begs the question", which, despite the purist lamentations, I'm going to ask on your behalf:

"How do you know which regex constructs will generate the least logging?"

And the answer is: suck it and see. (Paraphrase: benchmark!) But to do that, you need a range of candidate regexes that meet your actual needs--as opposed to either your stated needs, or those assumed by the pedants.

Back around the 5.6.1 timeframe, Jeffrey Friedl's book, Mastering Regular Expressions, would give you definitive answers to most regex questions--assuming that you could force yourself to read through the rather dry, complete work. But, like life, Perl and its regex engine moved on. So now, the only way to optimise your regexes is to benchmark them.

One possible approach is to realise that the looser (more permissive; less specific) the regex, the faster it is likely to run.
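Both the "benchmark!" advice and the looser-is-faster guess can be checked with the core Benchmark module. What follows is only a sketch: the input line and the three candidate patterns are hypothetical, and the ranking you get depends on your perl version and your real data, so it is not quoted here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: a log line that starts with an ISO-style timestamp.
my $line = '2009-03-14T15:09:26 GET /index.html 200';

# Hypothetical candidates, from most to least specific, that all
# capture the date part of this particular input.
my %candidates = (
    strict => qr/^(\d{4}-\d{2}-\d{2})T/,
    medium => qr/^([\d-]+)T/,
    loose  => qr/^([^T]+)T/,
);

# Sanity check first: a fast regex that captures the wrong thing is useless.
for my $name ( sort keys %candidates ) {
    my ($date) = $line =~ $candidates{$name}
        or die "$name failed to match";
    die "$name captured '$date'" unless $date eq '2009-03-14';
}

# Wrap each candidate in a closure for Benchmark.
my %tests;
for my $name ( keys %candidates ) {
    my $re = $candidates{$name};
    $tests{$name} = sub { my ($date) = $line =~ $re; return $date };
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese( -1, \%tests );
```

A looser pattern only counts as a candidate if it still rejects everything your real data needs rejected; the sanity check above covers a single sample line.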

Another is to post a "please speed this up" post here at PM. But don your thick skin, cos ~70% of the replies will likely be "Optimisation is the root of all evil" knock-offs! Without the proofs. Or understanding.

So, if you have just posted a generic "how to" question in place of a specific "I really need to speed up this regex" question, repost the latter. Make it a challenge! Offering mega-kudos.

Incite the competitive nature of the monks. Maybe, just maybe, you'll rekindle the interest of some of those who used to make this place so much fun, but who have ceased to interact here because the pedants and PC-police have driven away those who made this place work in the first place.

The japhys, and dwss and aristotles. Those with whom you could have a technical argument without personal affront. Those whose ingrained desire to know supplanted the group-speak invective for control and conformity.

We can but hope!


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"I'd rather go naked than blow up my ass"

In reply to Re: How do I optimize a regular expression? by BrowserUk
in thread How do I optimize a regular expression? by kyle
