in reply to Merge 2 or more logs, sort by date & time

Hi ImJustAFriend,

presumably, your individual log files are already sorted in chronological order.

In this case (and assuming you have a lot of data), merging the already-sorted files by date is usually much faster than sorting a concatenation of all the records. This idea is the basic principle of the merge sort algorithm, one of the best sort algorithms available.

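Basically, assuming you have only two input files, you read them in parallel, pick whichever of the two current records is earlier, write it out, read the next record from that file, and do it again. Each record is read and written only once, so the work grows linearly with the number of records, whereas a general-purpose sort of the concatenated data needs on the order of n log n comparisons.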
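To make the two-file case concrete, here is a minimal sketch in Python (your own script may well be in another language; the logic carries over). It assumes every line begins with a 19-character "YYYY-MM-DD HH:MM:SS" timestamp, which sorts correctly as a plain string; the names timestamp and merge_two are made up for this example, and you would need to adapt the key function to your real log format.

```python
def timestamp(line):
    # Assumption: each line starts with "YYYY-MM-DD HH:MM:SS" (19 characters),
    # so comparing these prefixes as strings gives chronological order.
    return line[:19]

def merge_two(lines_a, lines_b, key=timestamp):
    """Yield the lines of two already-sorted inputs in sorted order."""
    it_a, it_b = iter(lines_a), iter(lines_b)
    a = next(it_a, None)
    b = next(it_b, None)
    # While both inputs still have a candidate, emit the earlier one
    # and advance only the input it came from.
    while a is not None and b is not None:
        if key(a) <= key(b):
            yield a
            a = next(it_a, None)
        else:
            yield b
            b = next(it_b, None)
    # One input is exhausted; copy whatever is left of the other.
    while a is not None:
        yield a
        a = next(it_a, None)
    while b is not None:
        yield b
        b = next(it_b, None)
```

With real files you would pass two open file handles, for example `out.writelines(merge_two(open("a.log"), open("b.log")))`, so that only one line per input is held in memory at a time.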

If you have more than two input files, it becomes slightly more complicated, but it is not hard to merge the files two at a time until they have all been used.
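As a rough illustration of merging two by two, the Python sketch below folds any number of sorted inputs together pairwise, using the standard library's heapq.merge for each pair. It again assumes a leading 19-character "YYYY-MM-DD HH:MM:SS" timestamp on every line; timestamp and merge_all are hypothetical names for this example only.

```python
import heapq
from functools import reduce

def timestamp(line):
    # Assumption: each line starts with "YYYY-MM-DD HH:MM:SS" (19 characters).
    return line[:19]

def merge_all(sources, key=timestamp):
    """Merge any number of already-sorted inputs, two at a time."""
    # Start from an empty input and fold each source into the running result.
    # heapq.merge is lazy, so nothing is read until the result is consumed.
    return reduce(
        lambda merged, src: heapq.merge(merged, src, key=key),
        sources,
        iter(()),
    )
```

Note that heapq.merge also accepts all the inputs in one call, as in `heapq.merge(*sources, key=timestamp)`, which is probably the simpler choice when there are many files.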
