in reply to Re^4: suffix array efficiency
in thread suffix array efficiency

I'm not a bioinformatics guy, but if massive runs are actually a significant concern in this system, it's pretty easy to fix the worst-case behavior by adaptively boosting the window, so that worst case regresses to hdb's solution:
sub kennethk { my $string = shift; my $off = 0; my $step = 10; my @indices = sort {$off = 0; until (substr($string, $a+$off, $step) cmp sub +str($string, $b+$off, $step)){ $off += $step; $step *= 2; } + } 0 .. length($string) - 1; $_++ for @indices; return @indices; }
which gives a degenerate-case comparison of
RobertCraven 2.90/s kennethk 12.8/s xxx 13.1/s hdb 14.2/s
and comparable behavior in the rest of the benchmarks.

I also note your computer is twice as fast as mine. Wanna trade?


#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.