in reply to Reduce RAM required

It looks to me as if, within each $window, you only need to count the occurrences of each of the 4 letters and then write a random sequence of those letters with the correct frequency of each. This needs only a bit of bookkeeping when reading and writing the files, but it should consume far less memory: on the order of 4 counters times the number of windows.
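To make the counting idea concrete, here is a minimal sketch, written in Python for brevity rather than in the original poster's Perl. The function names `shuffle_window` and `shuffle_by_window` and the fixed `window_size` parameter are assumptions for illustration only; reading and writing the actual files is left out.

```python
import random
from collections import Counter

def shuffle_window(window, rng=random):
    """Return a random sequence with the same letter counts as `window`."""
    counts = Counter(window)  # at most 4 counters for A/C/G/T
    # Expand the counts back into letters, then shuffle them in place.
    letters = [base for base, n in counts.items() for _ in range(n)]
    rng.shuffle(letters)
    return "".join(letters)

def shuffle_by_window(seq, window_size, rng=random):
    """Shuffle `seq` one window at a time, preserving per-window counts."""
    return "".join(
        shuffle_window(seq[i:i + window_size], rng)
        for i in range(0, len(seq), window_size)
    )
```

Only the counters for the window currently being processed need to be held in memory, so the input could be streamed window by window instead of being loaded whole.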

But then I may not really have understood your shuffling requirements...

Re^2: Reduce RAM required
by onlyIDleft (Scribe) on Jan 09, 2019 at 16:10 UTC

    I agree with your approach of counting letters and using their frequencies.

    About the frequency of A/T/C/G, I agree again, but I would much prefer to calculate it across ALL sequences rather than one sequence (one $ID) at a time, so that this frequency-compliant sequence randomization is based on global rather than local frequencies. I suspect it will reduce the signal-to-noise ratio even further, though I have not tested that yet... Thank you!
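One possible reading of the global-frequency variant is a two-pass scheme: first tally the letters across all sequences, then generate each randomized sequence by sampling from those global frequencies. The sketch below is in Python and rests on assumptions: the names `global_counts` and `random_like` are hypothetical, and sampling each letter independently means a generated sequence only matches the global frequencies on average, not exactly.

```python
import random
from collections import Counter

def global_counts(sequences):
    """First pass: tally every letter across all sequences."""
    total = Counter()
    for seq in sequences:
        total.update(seq)
    return total

def random_like(length, counts, rng=random):
    """Second pass: draw `length` letters weighted by the global counts."""
    bases = sorted(counts)                 # fixed order for reproducibility
    weights = [counts[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))
```

For example, `random_like(len(seq), global_counts(all_seqs))` would produce a replacement for one sequence. The first pass needs only one small counter regardless of how many sequences there are.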