in reply to Reduce RAM required
Aren't the four letters in your sequences basically equiprobable (that is, do 'a', 't', 'g' and 'c' each appear roughly as often as the others)? If so, do you need the output file to have exactly the same number of occurrences of each letter as the input? I'm not a biologist, but the arbitrary yet quite large window size and the shuffling make me doubt the data is supposed to be meaningful in any way. Besides:
# throw in some reverse sequence alternatively to shuffle it more randomly

is useless at best. Trying to increase the randomness of some data without external input is either going to have no effect on the probabilities or, most likely, make the output less random.
If you don't care about exactly matching the number of occurrences of each letter, your program just becomes "replace each sequence by a random sequence of the same length", which can be coded in a few lines. If you do care, hdb's answer might be the way to go.
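
To illustrate the "few lines" claim, here is a minimal sketch, assuming the letters really are equiprobable and that only the length of each sequence has to be preserved. The name random_sequence is hypothetical, not something from your script:

```perl
use strict;
use warnings;

# Hypothetical helper: build a string of $length letters, each drawn
# uniformly and independently from the four-letter alphabet.
# Assumption: 'a', 'c', 'g' and 't' are equally likely.
sub random_sequence {
    my ($length) = @_;
    my @alphabet = qw(a c g t);
    return join '', map { $alphabet[ int rand @alphabet ] } 1 .. $length;
}

# Example: one random sequence of 20 letters.
print random_sequence(20), "\n";
```

If something like this is called once per input sequence while reading the file record by record, only one sequence has to be in memory at a time. How the records are split depends on your file format, which I am not assuming here.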
Re^2: Reduce RAM required
by onlyIDleft (Scribe) on Jan 09, 2019 at 16:27 UTC