in reply to Working on huge (GB sized) files
If the input files are CSV, the system's sort command can sort on an arbitrary field given the right options (-t sets the field separator and -k selects the key field).
The system's sort command doesn't need to hold all the data in memory at once: it writes temp files and merges them as needed to sort a huge file. This can be faster than you might imagine. Your code then only needs to deal with a small number of input lines at a time. Let the system sort handle the job of getting the relevant records adjacent in the file.
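To make the idea concrete, here is a rough sketch in Python (not code from the original question) of handing the sorting to the external sort and then walking the result one group at a time. The helper name `sorted_groups` is made up for illustration, and it assumes a plain CSV with no header row and no quoted fields containing the delimiter, since sort knows nothing about CSV quoting.

```python
import itertools
import os
import subprocess


def sorted_groups(path, field, delimiter=","):
    """Yield (key, rows) pairs, grouped on the 1-based column `field`.

    Hypothetical helper: the external sort command does the heavy lifting,
    so only one group of rows is held in memory at a time.
    Assumes no header row and no quoted delimiters inside fields.
    """
    key_spec = f"{field},{field}"  # limit the sort key to that one field
    env = dict(os.environ, LC_ALL="C")  # plain byte-wise ordering
    cmd = ["sort", "-t", delimiter, "-k", key_spec, path]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, env=env) as proc:
        # Lazily split each sorted line; nothing is read ahead of need.
        rows = (line.rstrip("\n").split(delimiter) for line in proc.stdout)
        # Equal keys are adjacent after sorting, so groupby sees each key once.
        for key, group in itertools.groupby(rows, key=lambda r: r[field - 1]):
            yield key, list(group)
    if proc.returncode != 0:
        raise RuntimeError(f"sort exited with status {proc.returncode}")
```

Calling `sorted_groups("data.csv", 2)` (the file name is a placeholder) would yield each distinct value of the second column together with its rows. If a single group could itself be too large for memory, the `list(group)` call would need to be replaced by processing the rows as they stream past.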
Re^2: Working on huge (GB sized) files
by vasavi (Initiate) on May 15, 2011 at 10:38 UTC