comment on

To illustrate how the difference between quick sort and merge sort can be huge, this is a small benchmark of the two algorithms implemented in pure Perl, sorting 5000 integers 80 times.

With an array in which the numbers have been generated with the random function:

$ perl bench_sort.pl
Benchmark: timing 80 iterations of merge, quick...
     merge:  3 wallclock secs ( 3.76 usr +  0.00 sys =  3.76 CPU) @ 21
+.28/s (n=80)
     quick:  2 wallclock secs ( 1.53 usr +  0.00 sys =  1.53 CPU) @ 52
+.32/s (n=80)
[download]

So, with random input, my own pure Perl implementation of quick sort is more than twice faster than my implementation of merge sort.

Now, lo and behold, see what happens if I run the same algorithms on an already sorted array:

$ perl bench_sort.pl
Benchmark: timing 80 iterations of merge, quick...
     merge:  3 wallclock secs ( 3.40 usr +  0.00 sys =  3.40 CPU) @ 23
+.53/s (n=80)
     quick: 199 wallclock secs (198.93 usr +  0.12 sys = 199.06 CPU) @
+  0.40/s (n=80)
[download]

Quick sort is now 58 times slower than merge sort, and about 130 times slower than on randomly generated data.

The same happens if the data is in reverse order:

$ perl bench_sort.pl
Benchmark: timing 80 iterations of merge, quick...
     merge:  3 wallclock secs ( 3.56 usr +  0.00 sys =  3.56 CPU) @ 22
+.50/s (n=80)
     quick: 199 wallclock secs (197.97 usr +  0.06 sys = 198.03 CPU) @
+  0.40/s (n=80)
[download]

Admittedly, such pathological behavior of quick sort is very rare with random data, but having to sort data that is almost sorted is a very common task in real life. A typical example is the Association of Tennis Professionals (ATP) rankings: new rankings are published once a week or so. From one week to the next there will usually be relatively small differences. So that if you take the ranking of last week, update the points of each player and do the sorting again, the data is almost sorted. Quick sort would be very bad for that (actually bubble sort would do much better in such a situation). The same would happen when recalculating the index of a database table after having done a few additions or deletions. It has been suggested that, for quick sort to be efficient, one need to have a first phase of randomization of the array to be sorted.

In reply to Re^2: sort behavior explanation required by Laurent_R
in thread sort behavior explanation required by ghosh123

Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!

Titles consisting of a single word are discouraged, and in most cases are disallowed outright.

Read Where should I post X? if you're not absolutely sure you're posting in the right place.

Please read these before you post! —

Posts may use any of the Perl Monks Approved HTML tags:

a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr

You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)

	For:		Use:
	&		`&`
	<		`<`
	>		`>`
	[		`[`
	]		`]`

Link using PerlMonks shortcuts! What shortcuts can I use for linking?

See Writeup Formatting Tips and other pages linked from there for more info.