in reply to Memory issue with large cancer gene data structure

$site_length_catch{$key1}[$key4]

Did you mean for $key4 to be an array index there, rather than a hash key? Also, what values does $key4 take?

If $key4 happens to be a large number, then you'll blow away all your memory in one step as the array attempts to grow to ludicrous size.
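To make the concern concrete, here is a minimal sketch (the variable names @by_index and %by_key are made up for illustration, and 35_000 is just a stand-in for a large position). Assigning to one high index extends a Perl array to cover every index up to it, whereas a hash only stores the key that was actually assigned:

```perl
use strict;
use warnings;

# Hypothetical names, for illustration only.
my ( @by_index, %by_key );

my $position = 35_000;       # stand-in for one large $key4 value

$by_index[$position] = 1;    # array is extended to cover indices 0 .. $position
$by_key{$position}   = 1;    # hash stores only the single key assigned

my $array_slots = scalar @by_index;       # number of elements in the array
my $hash_keys   = scalar keys %by_key;    # number of keys in the hash

print "array slots: $array_slots\n";
print "hash keys:   $hash_keys\n";
```

If most positions are empty, a hash keyed on the position may be the cheaper structure; whether that matters here depends on what values $key4 actually takes.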

Re^2: Memory issue with large cancer gene data structure
by ZWcarp (Beadle) on Jul 25, 2013 at 14:35 UTC

    $key4 is the amino acid position. It is derived from the first file and stripped of non-digits, so that it's just a numerical position.

    Cell $key4 is originally initialized to 0, and at this step it needs to be changed to the number of mutations at this site. The array length corresponds to the length of gene $key1, so each cell corresponds to an amino acid position in the gene. The cell is set equal to scalar ( @{$site{$key1}{$key4}} ) because this calculates the number of values at a mutation site, so the size essentially operates as a count of recurrent positions across different samples. The numbers for this are mostly 0-10; there are a few large ones (one is 35K) and a few below that.
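The step described above can be sketched roughly as follows. This is only an illustration under assumptions: the %gene_length lookup, the gene name and the sample labels are invented, and the position is used directly as the array index, which may not match how the real script numbers positions.

```perl
use strict;
use warnings;

# Invented sample data: one gene with an assumed length of 10.
my %gene_length = ( TP53 => 10 );    # hypothetical lookup, not from the original script

# %site maps gene -> position -> list of samples mutated at that position.
my %site = ( TP53 => { 3 => [ 's1', 's2', 's3' ], 7 => ['s4'] } );

my %site_length_catch;
for my $key1 ( keys %site ) {
    # One cell per amino acid position, each initialized to 0.
    $site_length_catch{$key1} = [ (0) x $gene_length{$key1} ];

    for my $key4 ( keys %{ $site{$key1} } ) {
        # The count of values at this site becomes the cell's value.
        $site_length_catch{$key1}[$key4] = scalar @{ $site{$key1}{$key4} };
    }
}

print join( ',', @{ $site_length_catch{TP53} } ), "\n";
```

With the array pre-sized to the gene length, a $key4 larger than that length would still silently extend the array, so a bounds check before the assignment may be worth adding.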