in reply to Re^3: Out of Memory when generating large matrix
in thread Out of Memory when generating large matrix
These algorithms are all limited to elements of fixed length (which is the case), but Sundial suggested unix sort which knows nothing about the nature of the input.
And did you try to look into the space complexity of your suggestions?
Counting Sort requires 4**21 buckets. Seriously ???
Radix Sort is still the best in this respect, but has O(l*n) with l length of alphabet (here 21). Even worse identical elements are in worst case only identified in the last step, requiring to be able to hold all elements in memory, while shifting them all l times from bucket to bucket.
A hash count can go thru different files without keeping them all in memory, because identical elements collapse into one count.
Last but not least, a hash count is much easier implemented.
Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Wikisyntax for the Monastery
|
---|
Replies are listed 'Best First'. | |
---|---|
Re^5: Out of Memory when generating large matrix (space complexity)
by Anonymous Monk on Mar 07, 2018 at 13:25 UTC | |
by LanX (Saint) on Mar 07, 2018 at 14:07 UTC |