in reply to Re^9: Storing large data structures on disk
in thread Storing large data structures on disk

@res holds on the order of 10^5-10^6 elements. Each result "object" is a small hash with only a few fields in it.

Now that I think about it again, I might want to add an extra level to my AoA: each nucleotide could have multiple (10^3) different lists of experiments, not only one set. The reason is that we also simulate "random experimental results", so we will actually have multiple @res arrays.

Re^11: Storing large data structures on disk
by BrowserUk (Patriarch) on Jun 01, 2010 at 08:23 UTC

    And is 200 an upper limit to the number of indices for one nucleotide for one experiment?

      That's correct, although it's not a strict limit (the number of results per nucleotide follows some distribution, so it could be a bit higher).

        So it's actually a 2D array indexed by (nucleotide, experiment set), i.e. 10^10 * 10^3 cells, where each cell points to a variable-length list of integers (up to, let's say, ~200).

        Besides that, we have 10^3 arrays, one per experiment set, each of size 10^5-10^6 (all of the same length, though). Each cell in such an array holds a reference to a small hash (a specific result).
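
To make the layout described above concrete, here is a minimal in-memory sketch in Perl with toy sizes. It assumes that the integers in each cell are indices into the result array of the same experiment set. The names @hits, @res_sets and results_for, and the hash fields id and score, are hypothetical and only for illustration. It says nothing about how to store this on disk at the real scale.

```perl
use strict;
use warnings;

# Toy sizes (assumptions); the figures quoted in the thread are far larger.
my $N_SETS = 3;   # experiment sets (thread: ~10^3)
my $N_RES  = 5;   # results per set (thread: 10^5-10^6)

# One result array per experiment set; each element is a small hash.
# The field names 'id' and 'score' are hypothetical.
my @res_sets;
for my $set (0 .. $N_SETS - 1) {
    $res_sets[$set] =
        [ map { +{ id => $_, score => $set * 10 + $_ } } 0 .. $N_RES - 1 ];
}

# $hits[$nuc][$set] is a variable-length list of integers, assumed here
# to be indices into $res_sets[$set]. Cells with no results stay undef.
my @hits;
$hits[0][0] = [ 1, 3 ];
$hits[0][2] = [ 4 ];
$hits[2][1] = [ 0, 2, 4 ];

# Return the result hashes for one nucleotide in one experiment set,
# or an empty list if that cell has no results.
sub results_for {
    my ($nuc, $set) = @_;
    my $idx = $hits[$nuc][$set] or return;
    return @{ $res_sets[$set] }[ @$idx ];
}

my @r = results_for(2, 1);
print join(',', map { $_->{id} } @r), "\n";   # prints "0,2,4"
```

Keeping the indices as plain integers, rather than storing hash references directly in each cell, means the 2D structure contains only numbers, which is the part that would have to be serialized compactly.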