in reply to Re^2: What is the best approach to check if a position falls within a target range?
in thread What is the best approach to check if a position falls within a target range?

FWIW, I have a solution that matches 2e6 random integers (0 .. 1000) against 200,000 randomly generated ranges (0..700, 1..300) in 45 minutes using 3GB of RAM.

Of course, of those 2e6 queries only about 1000 are unique, so it's doing roughly 2000 times more work than it needs to.
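
A minimal sketch of avoiding that repeated work, written in Python for brevity: scan the ranges once per distinct query value and reuse the answer for every repeat. The names `ranges_containing` and `match_queries` are hypothetical, ranges are assumed to be inclusive (start, end) pairs, and the inner scan is a plain linear pass rather than the solution described above.

```python
def ranges_containing(pos, ranges):
    """Indexes of the inclusive (start, end) ranges that contain pos."""
    return [i for i, (start, end) in enumerate(ranges) if start <= pos <= end]


def match_queries(queries, ranges):
    """Scan the ranges once per distinct query, then reuse the cached answer."""
    cache = {q: ranges_containing(q, ranges) for q in set(queries)}
    return [cache[q] for q in queries]


if __name__ == "__main__":
    ranges = [(0, 10), (5, 20), (15, 30)]   # illustrative data only
    queries = [7, 7, 18, 40, 7]             # 7 is looked up once, not three times
    print(match_queries(queries, ranges))
```

With only about 1000 distinct values among 2e6 queries, the expensive scan would run about 1000 times instead of 2e6 times.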


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^4: What is the best approach to check if a position falls within a target range?
by umasuresh (Hermit) on Feb 16, 2011 at 13:35 UTC
    Hi BrowserUk,
    Thanks very much for all your replies. I did some major refactoring of the code these past few days and was able to achieve a significant improvement in speed.
    Major changes are:
    1. Split the target region into 24 chunks, one per chromosome (chr), and loaded only the chr of interest into memory.
    2. Converted the target hash to a target array. This caused a big gain in speed.
    UPDATE
    3. Divided the target region into 8 chunks (10-12.5-25-50 and so on). Checked which chunk the query SNP falls in before assigning target status.
    Uma
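
One way the per-chromosome array idea could be sketched, in Python for brevity. This is not the refactored code described above: it swaps in a binary search over sorted start positions in place of the 8-chunk split, and it assumes the target ranges within a chromosome are inclusive and do not overlap. The names `build_index` and `in_target` are hypothetical.

```python
from bisect import bisect_right


def build_index(targets):
    """Group (chrom, start, end) targets into per-chromosome sorted arrays."""
    by_chrom = {}
    for chrom, start, end in targets:
        by_chrom.setdefault(chrom, []).append((start, end))
    index = {}
    for chrom, regions in by_chrom.items():
        regions.sort()  # sort by start position
        index[chrom] = ([s for s, _ in regions], [e for _, e in regions])
    return index


def in_target(index, chrom, pos):
    """True if pos lies in a target range on chrom (ranges assumed non-overlapping)."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last range starting at or before pos
    return i >= 0 and pos <= ends[i]


if __name__ == "__main__":
    # Illustrative data only.
    index = build_index([("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 60)])
    print(in_target(index, "chr1", 150), in_target(index, "chr1", 250))
```

Only the arrays for the chromosome being queried need to be held in memory, and each lookup costs a logarithmic number of comparisons instead of a scan over every target.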

      How long is your current processing taking?