Brethren,

In a recent post it was asked how to split a string on a delimiter that may itself be escaped.

The subject of unsupported variable-length lookbehind was broached, and the problem was then solved in a novel way, using variable-length lookahead on the reversed string. My own solution accomplishes the task without lookaround assertions at all, relying instead on a straightforward application of unrolling the loop.
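For readers who have not seen the other thread, here is a minimal sketch of what the two approaches might look like. It is not the code from that thread: the comma delimiter, the backslash escape, and the names split_unrolled and split_reversed are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Assumption: fields are separated by "," and a backslash escapes the
# character after it. Both choices are illustrative, not from the thread.

# Unrolled loop: normal* (special normal*)*, where "normal" is any
# character other than backslash or comma and "special" is an escaped pair.
sub split_unrolled {
    my ($str) = @_;
    my @fields = $str =~ /(?:\A|,)([^\\,]*(?:\\.[^\\,]*)*)/gs;
    return @fields;
}

# Reversed lookahead: in the reversed string the escaping backslashes come
# after the comma, so a comma is a delimiter unless an odd number of
# backslashes follows it.
sub split_reversed {
    my ($str) = @_;
    my @pieces = split /,(?!(?:\\\\)*\\(?!\\))/, scalar reverse($str), -1;
    my @fields = reverse map { scalar reverse $_ } @pieces;
    return @fields;
}

my $input = 'a,b\\,c,,d';    # the text a,b\,c,,d
print join('|', split_unrolled($input)), "\n";
print join('|', split_reversed($input)), "\n";
```

Both lines print a|b\,c||d for this input. The two subs are not identical at the edges: an empty input string gives one empty field from split_unrolled and an empty list from split_reversed, and a trailing lone backslash is dropped by the unrolled version.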

So I decided to Benchmark my solution against the clever one, with the usual module, and report the results here. (I was sorta hoping that mine was faster, since contemplating a daily habit of regex matching on reversed strings made me feel seasick.)
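A comparison of this kind could be set up roughly as follows. This is a sketch, assuming "the usual module" is the core Benchmark module; the sample data, the comma and backslash conventions, and the two regexes are my own illustrative stand-ins, written out in compact form so the snippet runs by itself.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample: 200 comma-separated fields, some containing an
# escaped comma (\,) or an escaped backslash (\\).
my $sample = join ',', ('foo\\,bar', 'baz', '', 'qux\\\\') x 50;

my %candidates = (
    # Unrolled loop, no lookaround.
    unrolled => sub {
        my @f = $sample =~ /(?:\A|,)([^\\,]*(?:\\.[^\\,]*)*)/gs;
        return scalar @f;
    },
    # Variable-length lookahead on the reversed string.
    reversed => sub {
        my @f = reverse map { scalar reverse $_ }
                split /,(?!(?:\\\\)*\\(?!\\))/, scalar reverse($sample), -1;
        return scalar @f;
    },
);

# A negative count asks Benchmark to run each candidate for at least
# that many CPU seconds, then print a rate comparison table.
cmpthese(-1, \%candidates);
```

The numbers will vary with the perl version and with the shape of the input, so it is worth trying short and long strings, and strings with few and many escapes, before reading much into a difference of ten percent or so.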

The results were interesting! It turns out, at least for this problem, that variable-length lookahead on the reversed string is about 10% faster than unrolling the loop, despite the number of calls made to reverse.

Why is this? Does this have something to do with the internal optimizers? Does the Regex engine squeeze that much more juice out of variable width lookahead than alternation? Is it a matter of regex compilation? Is this question misguided by something else altogether?

In reply to Seeking Audience with the Perl Regex Droids... by mobiusinversion