To illustrate how the difference between quick sort and merge sort can be huge, this is a small benchmark of the two algorithms implemented in pure Perl, sorting 5000 integers 80 times.
With an array in which the numbers have been generated with the random function:
So, with random input, my own pure Perl implementation of quick sort is more than twice faster than my implementation of merge sort.$ perl bench_sort.pl Benchmark: timing 80 iterations of merge, quick... merge: 3 wallclock secs ( 3.76 usr + 0.00 sys = 3.76 CPU) @ 21 +.28/s (n=80) quick: 2 wallclock secs ( 1.53 usr + 0.00 sys = 1.53 CPU) @ 52 +.32/s (n=80)
Now, lo and behold, see what happens if I run the same algorithms on an already sorted array:
Quick sort is now 58 times slower than merge sort, and about 130 times slower than on randomly generated data.$ perl bench_sort.pl Benchmark: timing 80 iterations of merge, quick... merge: 3 wallclock secs ( 3.40 usr + 0.00 sys = 3.40 CPU) @ 23 +.53/s (n=80) quick: 199 wallclock secs (198.93 usr + 0.12 sys = 199.06 CPU) @ + 0.40/s (n=80)
The same happens if the data is in reverse order:
$ perl bench_sort.pl Benchmark: timing 80 iterations of merge, quick... merge: 3 wallclock secs ( 3.56 usr + 0.00 sys = 3.56 CPU) @ 22 +.50/s (n=80) quick: 199 wallclock secs (197.97 usr + 0.06 sys = 198.03 CPU) @ + 0.40/s (n=80)
Admittedly, such pathological behavior of quick sort is very rare with random data, but having to sort data that is almost sorted is a very common task in real life. A typical example is the Association of Tennis Professionals (ATP) rankings: new rankings are published once a week or so. From one week to the next there will usually be relatively small differences. So that if you take the ranking of last week, update the points of each player and do the sorting again, the data is almost sorted. Quick sort would be very bad for that (actually bubble sort would do much better in such a situation). The same would happen when recalculating the index of a database table after having done a few additions or deletions. It has been suggested that, for quick sort to be efficient, one need to have a first phase of randomization of the array to be sorted.
In reply to Re^2: sort behavior explanation required
by Laurent_R
in thread sort behavior explanation required
by ghosh123
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |