An algorithm A was shown to compute the minimum and the maximum of a set C of 100 comparable elements in o(A,C) = 148 comparisons. It was claimed that this solution is optimal.
The basic idea
The idea is to show that any algorithm A that works on n elements can be transformed into an algorithm A' that solves the problem for n-2 elements, using 3 comparisons less.
This leads to a contradiction if 148 were not minimal: one needs at least 1 comparison for n = 2, and stripping 100 elements down to 2 removes (100-2)/2 * 3 = 147 comparisons, so any correct algorithm for n = 100 needs at least 147 + 1 = 148.
In other words, if fewer comparisons sufficed, repeated transformation of the algorithm would lead to a magical solution that works without any comparisons. (One could also make this argument cleaner by induction, see below.)
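The counting behind this contradiction can be checked mechanically; a quick sketch (variable names are illustrative only):

```python
# Start from the claimed optimum for n = 100 and strip off 2 elements
# (and 3 comparisons) at a time until only n = 2 remains.
comparisons = 148
n = 100
while n > 2:
    n -= 2
    comparisons -= 3

# After 49 reduction steps exactly 1 comparison is left for n = 2.
print(n, comparisons)
```

Had we started below 148, the final value would drop below 1, i.e. below the obvious minimum for two elements.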
Construction of A'
Given a set C' of n elements and an algorithm A for n+2 elements.
We apply A to an expanded set with two new pseudo-elements I := {Inf, -Inf}, which are extreme by definition:
C := C' + {Inf, -Inf}
Inf and -Inf have the property that Inf always compares as bigger and -Inf always as smaller, in every comparison.
Since the result is therefore always known in advance, we don't need explicit comparisons. We can short-circuit the step and feed the result immediately into the algorithm A:
- cmp( c , Inf) = smaller
- cmp( c ,-Inf) = bigger
- cmp(-Inf,Inf) = smaller (edge case of the former)
But we keep a record of which elements in C' were compared with them.
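A minimal sketch of this bookkeeping in Python (the sentinel objects and the `RecordingCmp` class are my own illustrative names, not part of the original construction): comparisons against Inf/-Inf are answered for free and merely recorded.

```python
# Sentinels that are extreme by definition (hypothetical helpers).
INF = object()      # bigger than everything
NEG_INF = object()  # smaller than everything

class RecordingCmp:
    def __init__(self):
        self.real = 0          # comparisons that actually cost something
        self.c_plus = set()    # elements of C' compared with Inf
        self.c_minus = set()   # elements of C' compared with -Inf

    def cmp(self, a, b):
        """Return -1 if a < b, +1 if a > b. Sentinel comparisons are free."""
        if a is INF or b is NEG_INF:       # a is biggest / b is smallest
            other = b if a is INF else a
            if other is not INF and other is not NEG_INF:
                (self.c_plus if a is INF else self.c_minus).add(other)
            return 1
        if b is INF or a is NEG_INF:       # b is biggest / a is smallest
            other = a if b is INF else b
            if other is not INF and other is not NEG_INF:
                (self.c_plus if b is INF else self.c_minus).add(other)
            return -1
        self.real += 1                     # a genuine comparison
        return (a > b) - (a < b)
```

Running A with this comparator costs only `real` comparisons; `c_plus` and `c_minus` are the recorded sets C+ and C- used below.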
Furthermore - without loss of generality - we also choose cmp(-Inf,Inf) as the first (pseudo-)comparison in A, since we are free to choose where we start. (1)
After running through the algorithm A we have recorded C+, the subset of C' (with n+ elements) whose elements were compared with Inf: C+ = { c in C' | c was compared with Inf }.
It's easy to see that the second-biggest element in C - which is max(C') - must have been compared directly with Inf and is therefore in C+.
(Here is the argument: in order to succeed, the algorithm A must have established, for every c in C, a chain of comparisons
c < c1 < ... < cm < max(C).
But for the second-biggest element of C there can't be any elements in between => m = 0 => a direct comparison with max(C) = Inf must have happened.)
Hence we get the desired max(C') by continuing the algorithm to compute max(C+), which is trivially known to need only n+ - 1 comparisons. (2)
Analogously, min(C') = min(C-) in n- - 1 comparisons. (3)
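That a maximum over k elements needs exactly k - 1 comparisons is just a linear scan; a sketch that counts the comparisons explicitly (the function name is mine):

```python
def max_with_count(xs):
    """Return (maximum, number_of_comparisons) for a non-empty iterable."""
    it = iter(xs)
    best = next(it)
    count = 0
    for x in it:
        count += 1          # one comparison per remaining element
        if x > best:
            best = x
    return best, count
```

The same scan with `<` gives min(C-) in n- - 1 comparisons.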
In short the new algorithm is
A' := A + max(C+) + min(C-)
With (1), (2) and (3) we avoided 1 + n+ + n- pseudo-comparisons and needed (n+ - 1) + (n- - 1) extra comparisons, a net saving of 3:
=> o(A',C') = o(A,C) - 3
q.e.d.
Generalization
This can be expanded into a general formula d(n) for the optimal number of comparisons needed for a set of size n.
It's obvious that d(2) = 1.
We have just shown that d(n+2) >= d(n) + 3. (4)
On the other hand, we can always explicitly construct a new algorithm A* for a set C* := C + {a,b} with two new elements a, b by adding 3 new comparisons to A: (5)
- cmp(a, b)            # determine the min and max of the two new elements
- cmp(max{a,b}, max C) # compare the maxima
- cmp(min{a,b}, min C) # compare the minima
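Iterating this construction up from d(2) = 1 gives the standard pairwise algorithm; a runnable sketch that also counts its comparisons (the function name is mine, not from the text):

```python
def min_max(xs):
    """Return (minimum, maximum, number_of_comparisons) for even-length xs."""
    assert len(xs) >= 2 and len(xs) % 2 == 0
    comparisons = 1
    # seed with the first pair: 1 comparison
    a, b = xs[0], xs[1]
    lo, hi = (a, b) if a < b else (b, a)
    # every further pair costs exactly 3 comparisons
    for i in range(2, len(xs), 2):
        a, b = xs[i], xs[i + 1]
        comparisons += 1
        small, big = (a, b) if a < b else (b, a)
        comparisons += 1
        if big > hi:
            hi = big
        comparisons += 1
        if small < lo:
            lo = small
    return lo, hi, comparisons
```

For 100 elements this performs 1 + 49 * 3 = 148 comparisons, matching the claimed optimum.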
With (4) and (5) we get the exact result d(n+2) = d(n) + 3 for all n,
or more explicitly:
- d(n) = 3/2 n -2 ; for n even
- d(n) = 3/2 (n-1) ; for n odd
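The closed formula as code, with the even/odd cases from above (a sketch for quick consistency checks):

```python
def d(n):
    """Optimal number of comparisons for min/max of n >= 2 elements."""
    assert n >= 2
    return 3 * n // 2 - 2 if n % 2 == 0 else 3 * (n - 1) // 2
```

Sanity checks: d(2) = 1, d(100) = 148, and d(n+2) = d(n) + 3 holds for all n.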
Meta
In a more formal proof I would have stated various lemmas, used induction, and chosen better notation. I would also have separated statements from proofs more cleanly.
But I hope it's reasonably clear as it stands.