in reply to Re^2: read and sort multiple files
in thread read and sort multiple files
Instead of using merge-sort alone ( as a plain approach ) a hybrid way of using in-memory sorting and merge-sort can be combined and used together.
That's what the post to which you replied already suggested.
Out of the total 'n' number of files, sort only 'm' files in memory
A 100MB file already takes up a pretty major chunk of memory. Remember, if the array isn't preallocated to hold enough lines, twice the size of the data is needed.
If I were to re-implement the work in Perl, I'd probably do something equivalent to
Update: I struck out a statement that's probably wrong. There is overhead, but it should be proportional to the number of lines, not number of bytes.
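For illustration, the hybrid idea discussed in this thread (sort a few files at a time in memory, then merge the resulting sorted runs) might look roughly like the sketch below. It is written in Python purely as an example and is not the Perl code the post alludes to. The names `hybrid_sort`, `sort_batch_to_run` and `batch_size` are hypothetical, and the sketch assumes plain text files sorted line by line in default string order.

```python
import heapq
import os
import tempfile


def sort_batch_to_run(paths, run_dir):
    """Load a small batch of files into memory, sort the lines,
    and write them out as one sorted run (hypothetical helper)."""
    lines = []
    for path in paths:
        with open(path) as fh:
            lines.extend(fh)
    # Make sure every line ends with a newline so lines from
    # different files cannot fuse together in the output.
    lines = [ln if ln.endswith("\n") else ln + "\n" for ln in lines]
    lines.sort()
    fd, run_path = tempfile.mkstemp(dir=run_dir, suffix=".run", text=True)
    with os.fdopen(fd, "w") as out:
        out.writelines(lines)
    return run_path


def hybrid_sort(paths, out_path, batch_size=2):
    """Sort `batch_size` files at a time in memory, then merge the runs.
    `batch_size` is an assumed tuning knob: pick it so that one batch
    fits comfortably in memory."""
    with tempfile.TemporaryDirectory() as run_dir:
        runs = [
            sort_batch_to_run(paths[i:i + batch_size], run_dir)
            for i in range(0, len(paths), batch_size)
        ]
        handles = [open(run) for run in runs]
        try:
            with open(out_path, "w") as out:
                # heapq.merge reads the sorted runs lazily, so only one
                # pending line per run is held in memory during the merge.
                out.writelines(heapq.merge(*handles))
        finally:
            for fh in handles:
                fh.close()
```

Only the in-memory phase is bounded by `batch_size`; the merge phase needs one open file handle per run, so a very large number of runs would call for merging in several passes.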