I *think* that a regex will apply Boyer-Moore automatically, whereas index will only apply it if you study the string. That's how I interpret these results:

    use Benchmark qw( cmpthese );

    for( 30, 300, 3000, 30000 ) {
        # needle buried in the middle of 2 * $_ filler characters
        $s = 'x'x$_ . 'reg exp' . 'x'x$_;
        cmpthese -1, {
            REGEX => q[ $x = $s =~ /reg exp/; ],
            INDEX => q[ $x = index $s, 'reg exp'; ],
        };
    }

               Rate INDEX REGEX
    INDEX 1956298/s    --  -19%
    REGEX 2425347/s   24%    --
              Rate INDEX REGEX
    INDEX 340657/s    --  -65%
    REGEX 973582/s  186%    --
              Rate INDEX REGEX
    INDEX  40348/s    --  -74%
    REGEX 155495/s  285%    --
             Rate INDEX REGEX
    INDEX  3530/s    --  -77%
    REGEX 15077/s  327%    --

With study

    use Benchmark qw( cmpthese );

    for( 30, 300, 3000, 30000 ) {
        $s = 'x'x$_ . 'reg exp' . 'x'x$_;
        study $s;    # the only change from the first run
        cmpthese -1, {
            REGEX => q[ $x = $s =~ /reg exp/; ],
            INDEX => q[ $x = index $s, 'reg exp'; ],
        };
    }

               Rate REGEX INDEX
    REGEX 2101154/s    --  -28%
    INDEX 2912801/s   39%    --
               Rate REGEX INDEX
    REGEX  829929/s    --  -32%
    INDEX 1224656/s   48%    --
              Rate REGEX INDEX
    REGEX 100781/s    --  -38%
    INDEX 161881/s   61%    --
             Rate REGEX INDEX
    REGEX 10983/s    --  -26%
    INDEX 14792/s   35%    --

Once you study the string, index wins out (marginally, as the length of the string increases), which I attribute to the saving from not having to invoke the regex engine. That saving shrinks with length as the regex startup costs are amortised over the longer string.
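For anyone who wants to rerun the comparison as a standalone script, here is a minimal sketch. It assumes the core Benchmark module and swaps the string-eval snippets for closures over lexicals so that it runs under strict; run_case is just an illustrative name, not something from the runs above. Two caveats: the extra sub call adds a fixed overhead to both sides, so the percentages at the short lengths will not match the figures above, and according to perlfunc study has been a no-op since Perl 5.16, so on a modern perl the "with study" pass would not be expected to differ from the other one.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# run_case is a hypothetical helper name, not from the original runs.
# $n     : number of filler characters on each side of the needle
# $study : true to call study on the string before timing
sub run_case {
    my( $n, $study ) = @_;
    my $s = 'x' x $n . 'reg exp' . 'x' x $n;
    study $s if $study;    # does nothing on perl 5.16 and later
    my $x;
    print "length $n, ", ( $study ? 'with' : 'without' ), " study\n";
    cmpthese( -1, {
        REGEX => sub { $x = $s =~ /reg exp/ },
        INDEX => sub { $x = index $s, 'reg exp' },
    } );
}

# Same four lengths as above, first without and then with study.
for my $study ( 0, 1 ) {
    run_case( $_, $study ) for 30, 300, 3000, 30000;
}
```

Printing $] alongside the results makes it easier to compare runs from different perl versions.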

Without the study, index does badly because it does a dumb search rather than an intelligent one. That is why the regex pulls further ahead of index as the string gets longer.
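To make "dumb" versus "intelligent" concrete, here is a rough sketch that counts character comparisons for a naive scan and for the Horspool simplification of Boyer-Moore. This is pure Perl written for illustration and is not what perl does internally; naive_find and horspool_find are made-up names. It also assumes a filler of 'a' rather than the 'x' used in the benchmark: 'x' is the second-to-last character of 'reg exp', so this simplified variant could only skip one position at a time through an 'x' filler.

```perl
use strict;
use warnings;

# Naive scan: try every alignment, comparing left to right.
# Returns ( position or -1, number of character comparisons ).
sub naive_find {
    my( $hay, $needle ) = @_;
    my( $n, $m, $cmp ) = ( length $hay, length $needle, 0 );
    for my $i ( 0 .. $n - $m ) {
        my $j = 0;
        while( $j < $m ) {
            ++$cmp;
            last if substr( $hay, $i + $j, 1 ) ne substr( $needle, $j, 1 );
            ++$j;
        }
        return( $i, $cmp ) if $j == $m;
    }
    return( -1, $cmp );
}

# Horspool: compare right to left, then shift by a distance looked up
# from the haystack character under the needle's last position.
sub horspool_find {
    my( $hay, $needle ) = @_;
    my( $n, $m, $cmp ) = ( length $hay, length $needle, 0 );
    return( 0, 0 ) if $m == 0;

    # Distance from each character's last occurrence (ignoring the
    # needle's final character) to the end of the needle.
    my %shift;
    $shift{ substr( $needle, $_, 1 ) } = $m - 1 - $_ for 0 .. $m - 2;

    my $i = 0;
    while( $i <= $n - $m ) {
        my $j = $m - 1;
        while( $j >= 0 ) {
            ++$cmp;
            last if substr( $hay, $i + $j, 1 ) ne substr( $needle, $j, 1 );
            --$j;
        }
        return( $i, $cmp ) if $j < 0;
        # Characters absent from the table allow a full-length skip.
        $i += $shift{ substr( $hay, $i + $m - 1, 1 ) } // $m;
    }
    return( -1, $cmp );
}

for my $n ( 30, 300, 3000, 30000 ) {
    my $s = 'a' x $n . 'reg exp' . 'a' x $n;    # 'a' filler is an assumption
    my( $p1, $c1 ) = naive_find( $s, 'reg exp' );
    my( $p2, $c2 ) = horspool_find( $s, 'reg exp' );
    printf "n=%-5d naive: pos %d, %d comparisons; horspool: pos %d, %d comparisons\n",
        $n, $p1, $c1, $p2, $c2;
}
```

With a filler that never occurs in the needle, the skipping version can jump the full needle length after each mismatch, so its comparison count grows far more slowly than the naive one. How much is saved in practice depends heavily on the filler and the needle.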

But don't be surprised if dave_the_m or demerphq pop by and point out that I'm misinterpreting.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

In reply to Re^4: Performance optimization question by BrowserUk
in thread Performance optimization question by vit
