saranrsm has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am amazed by the speed of the index function, which outperforms any other function that returns the position of a substring.

So all I want to know is what makes the index function so fast: is it built on a particular algorithm, or is something else responsible?

(I have already asked this question in the Chatterbox, but as I couldn't find an answer there, I am posting it here.)

Re: How Index function works??
by Corion (Patriarch) on Oct 24, 2011 at 07:36 UTC
      When I did speed tests comparing index and a simple m//, the regex (which uses Boyer-Moore) was (mostly) considerably faster.

      Since you are linking to v5.14.2 and I'm still using v5.10.0, I suppose that the implementation of index has changed.

      Cheers Rolf

        According to perlreguts, the RE engine also uses fbm_instr() to scan for the leftmost atom. There shouldn't be any reason why the performance of the two should differ by a large margin, and I would expect the regular expression to be a bit slower in the general case due to the setup cost. So I think either your data somehow favours a branch in the RE engine that reaches fbm_instr() faster, or the benchmark is not measuring what you want. I also vaguely remember a thread about such a discrepancy on this site, but I can't find it. "is index faster than regexp for fixed text token?"
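        A comparison like the one discussed here could be set up roughly as follows. This is only a sketch under assumptions: the haystack, the needle, and the names pos_by_index and pos_by_regex are made up for illustration and are not the benchmark anyone in this thread actually ran. Results will depend heavily on the data and the perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data: a 1 MB haystack with the needle only at the very end.
my $needle   = 'needle_token';
my $haystack = ('abcdefghij' x 100_000) . $needle;

# Position via index(); returns -1 when the needle is absent.
sub pos_by_index {
    return index($haystack, $needle);
}

# Position via a literal regex: \Q...\E disables metacharacters,
# and $-[0] holds the offset where the last successful match started.
sub pos_by_regex {
    return $haystack =~ /\Q$needle\E/ ? $-[0] : -1;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    index => \&pos_by_index,
    regex => \&pos_by_regex,
});
```

        Checking that both subs return the same offset before timing them helps rule out a benchmark that is not measuring what you want.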

        git blame tells me nobody has touched index since 2009, and that change was a refcounting change. The other changes were in 2006.

        If you don't mind, could you post the benchmark code? I'm just curious to look it over.


        Dave

        LanX, how big was your file? I used a 300 MB file: searching for a pattern with index took less than a second, but with a regex it took over 5 minutes and I had to stop the script. I am using perl v5.14.
Re: How Index function works??
by moritz (Cardinal) on Oct 24, 2011 at 07:37 UTC
      So what's the conclusion, guys? Does index use the Boyer-Moore algorithm or not?
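
      For readers unfamiliar with the algorithm family being discussed, here is a minimal pure-Perl sketch of the Boyer-Moore-Horspool variant (a simplified relative of Boyer-Moore). It is not perl's internal implementation, and bmh_index is a hypothetical name chosen for illustration. The point it shows is the skip table, which lets the search window jump ahead by more than one character at a time. It needs perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;

# bmh_index is a hypothetical helper that mimics index($haystack, $needle)
# using the Horspool bad-character rule.
sub bmh_index {
    my ($haystack, $needle) = @_;
    my $n = length $haystack;
    my $m = length $needle;
    return 0  if $m == 0;    # index() also returns 0 for an empty needle
    return -1 if $m > $n;

    # Shift table: for each needle character except the last one, how far
    # the window may jump when that character sits under the window's end.
    my %shift;
    $shift{ substr($needle, $_, 1) } = $m - 1 - $_ for 0 .. $m - 2;

    my $pos = 0;
    while ($pos <= $n - $m) {
        # Compare the whole window against the needle.
        return $pos if substr($haystack, $pos, $m) eq $needle;

        # The character under the window's last slot decides the jump;
        # characters that do not occur in the needle allow a full-length jump.
        my $last = substr($haystack, $pos + $m - 1, 1);
        $pos += $shift{$last} // $m;
    }
    return -1;
}

print bmh_index('hello world', 'world'), "\n";    # prints 6
print bmh_index('hello world', 'perl'),  "\n";    # prints -1
```

      In pure Perl this will be far slower than the built-in index; it only illustrates why skipping characters can pay off on long haystacks.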