Yes, that's right.
Sorting the split chunks isn't free either: there are an additional n-1 opens and closes for the extra n-1 files created when working with split chunks of a file.
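Roughly what I had in mind, as a toy count only (assuming the file is split into n chunks and each chunk is sorted as its own temporary file; n = 10 is just an example):

    use strict;
    use warnings;

    my $n = 10;                  # number of chunks the file is split into (just an example)

    my $whole_file_opens = 1;    # sorting the file in one piece: 1 open, 1 close
    my $chunked_opens    = $n;   # sorting n chunk files: 1 open and 1 close per chunk
    my $extra            = $chunked_opens - $whole_file_opens;   # the extra n - 1

    print "extra opens (and closes) from splitting into $n chunks: $extra\n";
    # prints: extra opens (and closes) from splitting into 10 chunks: 9

| [reply] |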
What extra opens and closes? Grouping files greatly cuts down on the number of opens and closes. Consider these cases:
- A few small files, say 10 files, where 100 would fit in memory at a time.
- Lots of small files, say 100 files, where only 10 fit in memory at a time.
- Big files, where it simply won't work without splitting.
You could cut down on the number of opens and closes by merging more than two files at a time, but it's not relevant here because it would apply to both grouping and non-grouping algorithms.
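To make the grouping idea concrete, here is a rough Perl sketch; the chunk_*.txt file names and the $max_in_memory limit are placeholders, not anything from your script. It reads as many chunk files as fit in memory at once, sorts them together, and writes one output file per group, so each input file is opened and closed exactly once.

    use strict;
    use warnings;

    # Placeholder inputs: a list of already-split chunk files and a guess at
    # how many of them fit in memory at once (both are assumptions).
    my @chunk_files   = glob 'chunk_*.txt';
    my $max_in_memory = 10;

    my $group_no = 0;
    while (@chunk_files) {
        my @group = splice @chunk_files, 0, $max_in_memory;

        # One open and one close per input file in this group.
        my @lines;
        for my $file (@group) {
            open my $in, '<', $file or die "Cannot open $file: $!";
            push @lines, <$in>;
            close $in;
        }

        # Sort the whole group in memory and write it out in one go:
        # one extra open/close per group, not one per input file.
        my $out_file = sprintf 'sorted_group_%03d.txt', $group_no++;
        open my $out, '>', $out_file or die "Cannot open $out_file: $!";
        print {$out} sort @lines;
        close $out;
    }

With 100 chunk files and room for 10 in memory, that is 100 input opens plus 10 output opens (and the same number of closes), and the 10 sorted group files can then be merged in a later pass.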
| [reply] [d/l] |
I think I haven't understood what you are trying to say.
This is what I thought: with 10 files, and room in memory for 100, all 10 can be loaded into memory at once.
- 10 open operations for the 10 input files
- load the data from all 10 files into memory
- 10 close operations for the 10 input files
- sort everything in memory
- 1 open for the output file
- flush the sorted data from memory to the output file
- 1 close for the output file
For the above example, that's 2 * (n + 1) open and close operations: n + 1 opens and n + 1 closes, with n = 10 input files.
So in this case it doesn't tally with what you have said.
Apologies again if I haven't understood what you meant.
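To spell out how I am counting, here is a toy snippet (n = 10 as in the example above; nothing beyond that count is assumed):

    use strict;
    use warnings;

    my $n = 10;                     # number of input files, all fitting in memory

    my $opens  = $n + 1;            # n input opens  + 1 output open
    my $closes = $n + 1;            # n input closes + 1 output close
    my $total  = $opens + $closes;  # i.e. 2 * (n + 1)

    print "opens = $opens, closes = $closes, total = $total\n";
    # prints: opens = 11, closes = 11, total = 22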
| [reply] [d/l] |