Would "index" be faster than using regular expression?

That would depend on a number of factors. I provided the Benchmark link so that you could determine this for yourself.

Before worrying too much about speed, ask yourself how important that is. If you can process all your data in 100ms, how much effort are you prepared to put in to get it to run in, say, half that time? Would anyone notice the difference? What are you doing with the results? Sending them to a terminal, a file, a database, a printer, or across a network will probably take much longer than any processing occurring in the CPU.

If you're just looking for a function that searches for one string inside another, that's what index does and what I'd probably choose for that task; if patterns are involved, that's what a regexp engine does and, in that case, that's what I'd probably use.

If you do decide to optimise, you need to start with a regex that works correctly and consistently. toolic has provided code (in Re: Search for account number in a file name) that returns a correct result for your single example based on $account appearing anywhere within $filename. You didn't specify anything beyond this; however, I raised the issue that its position might be meaningful.

The regexp engine will typically find an anchored pattern faster than an unanchored one. If you know that $filename will always start with zero or more 0s immediately followed by $account, then you can write /^0*$account/ which would probably be faster than /$account/; if you know it will always start with exactly three zeros, then /^000$account/ may be faster still.
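As a rough sketch of those three patterns side by side: the values of $account and $filename here are made up for illustration, and I've added \Q...\E (not in the patterns above) on the assumption that $account should be matched literally rather than as a pattern.

```perl
use strict;
use warnings;

# Hypothetical sample data; substitute your real values.
my $account  = '12345';
my $filename = '00012345_statement.txt';

# \Q...\E quotes any regex metacharacters that $account might contain.
my $anywhere    = $filename =~ /\Q$account\E/     ? 1 : 0;  # unanchored
my $zero_padded = $filename =~ /^0*\Q$account\E/  ? 1 : 0;  # zero or more leading 0s
my $three_zeros = $filename =~ /^000\Q$account\E/ ? 1 : 0;  # exactly three leading 0s

print "anywhere=$anywhere zero_padded=$zero_padded three_zeros=$three_zeros\n";
```

All three match this particular filename; they differ in what else they would accept, so pick the one that reflects what you actually know about your data.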

Similarly, you'll need to look at the code you're using with index. If you have information regarding the position of $account, then maybe, instead of index($filename, $account) >= 0, you'd want index($filename, $account) >= 3 or index($filename, $account) == 3 or something else. Perhaps you'd use the optional third argument: index($filename, $account, $position).
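Here's a minimal sketch of those index variations. The sample values are assumptions of mine, chosen so that $account starts at offset 3; your real data may well differ.

```perl
use strict;
use warnings;

# Hypothetical sample data; substitute your real values.
my $account  = '12345';
my $filename = '00012345_statement.txt';

# index returns the zero-based offset of the first match, or -1 if absent.
my $pos = index($filename, $account);

my $anywhere = $pos >= 0 ? 1 : 0;   # found somewhere in the string
my $at_3     = $pos == 3 ? 1 : 0;   # found exactly at offset 3

# Third argument: start searching at offset 3, skipping the first three characters.
my $from_3 = index($filename, $account, 3) >= 0 ? 1 : 0;

print "pos=$pos anywhere=$anywhere at_3=$at_3 from_3=$from_3\n";
```

Note that the third-argument form still returns an offset measured from the start of $filename, not from the starting position.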

When you have two (or more) pieces of code that are working correctly, then you can compare them. That's when you'd use Benchmark.
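A comparison might look something like the following. This is only a sketch: the sample data and the two candidates are assumptions, and you'd want to run it against data representative of what you really have.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data; substitute your real values.
my $account  = '12345';
my $filename = '00012345_statement.txt';

# Two candidates that should give the same answer for this data.
my %candidate = (
    regex_anchored => sub { $filename =~ /^0*\Q$account\E/ ? 1 : 0 },
    index_at_3     => sub { index($filename, $account) == 3 ? 1 : 0 },
);

# Check correctness first; timing wrong code is pointless.
for my $name (sort keys %candidate) {
    die "$name gives the wrong result\n" unless $candidate{$name}->();
}

# A negative count means: run each candidate for at least that many CPU seconds.
cmpthese(-1, \%candidate);
```

The numbers you get will vary with your perl version, your hardware and, most importantly, your data, so treat any single run as indicative only.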

-- Ken


In reply to Re^3: Search for account number in a file name by kcott
in thread Search for account number in a file name by Anonymous Monk
