Your binary search finds an index that matches, not necessarily the first index that matches. So, yes, that was an accidental success. Your loop to search from that point on should stop early, not progress all the way to the end of the list. And your sub isn't properly encapsulated.

Last point first, your code:

my $start = bsearch( \@indices, $pattern );

indicates that your sub should actually start with something more like:

sub bsearch { my( $indices_hv, $pattern ) = @_

It would be better design if you didn't access global variables (or file-scoped lexicals) from this sub. To do so, you'd need to pass in at last one more variable.

You can catch this type of error by minimizing the executable code you place at the top level in your script. This is one of the advantages to the style that I still use, (tye)Re: Stupid question.

I don't see a clever way to make a binary search end up stopping on the first match. So I'd just leave your binary search as-is and next march backward until you find the first non-match and then march forward until you find the first non-match in that direction.

I'm guessing that sorting a list of indexes is a recommended approach for making the sorting more efficient (not moving around potentially long strings). But I bet it results in slightly slower Perl code (and more complex code as well).

I'd also remove the '$' entry immediately after populating that list, not after sorting. Or just change the while condition to, for example:

while( 1 < length $str ) {

- tye        


In reply to Re: searching for a pattern using suffix arrays (close) by tye
in thread searching for a pattern using suffix arrays by abualiga

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.