in reply to Re^3: Search a hash for closest match
in thread Search a hash for closest match

I think my response may have been a bit unclear/ambiguous:

there are only 928K positions in total, so we can roughly estimate 40K elements per array (1M sites/25 chromosomes) and only 1 array will have to be searched based on the chromosome given.

Replies are listed 'Best First'.
Re^5: Search a hash for closest match
by kennethk (Abbot) on Nov 01, 2011 at 18:47 UTC
    I followed what you said. Allow me to illustrate my point with an exchange I had with a colleague yesterday. He came into my office, asking me how to optimize sorting on a large set of objects. He thought perl was running pretty slowly, and this seemed to him to be the issue. We had a lovely discussion where I suggested he use the Orcish Maneuver as a solution for dropping the expense of his sort. I also suggested strongly that he should think about running his code through Devel::NYTProf to find out if the bottleneck was where he thought. And of course it wasn't, and his sort speed was completely irrelevant.

    Your proposed sorting algorithm is probably a good optimization. But make your script work first, then make it work quickly. The more complex the search algorithm, the easier it is to make a stupid mistake. Linear search may be slow, but it's damn hard to introduce an inobvious bug to it.