in reply to 1. Go compare! Guardian's algortithm riddle and mathematical proof

Here the promised proof.

An algorithm A was shown to calculate the min_max of a set of 100 sortable elements in a set C in o(A,C) = 148 comparisons. It was maintained that this solution is optimal.

I'm shying away from a full formal proof, because of

A sketch should be sufficient, feel free to ask if something is unclear

The basic idea

We'll is to show that any algorithm A applicable at n elements can be transformed into an algorithm A' solving the problem for n-2 elements. But in 3 steps less.

This leads to a contradiction if 148 wasn't minimal, because one needs at least 1 comparison for n=2 and (100-2)/2*3 = 147.

I.o.W repeated transformation of the algorithm would lead to a magical solution which works without comparisons. (one could also make this argument cleaner by induction, see below)

Construction of A'

Given a set C' of n elements, and an Algorithm A for n+2 elements.

We apply A on a expanded set with two new pseudo elements I : = {Inf, -Inf} which are extreme by definition.

C := C' + {Inf,-Inf}

Inf and -Inf have the property that they are always bigger resp. smaller with every comparison.

Since the result is always obvious we don't need explicit comparisons. We can short-circuit the step and feed the result immediately into the algorithm A.

But we keep records which elements in C' were compared with them.

Furthermore - without loss of generality - we also chose cmp(-Inf,Inf) for the first (pseudo) comparison in A - since we are free to chose were we start. (1)

After running thru the Algorithm A we have recorded C+ the subset of C' with n+ elements which were compared with Inf, C+ = { c in C' | cmp(c,Inf) }

It's easy to see that the second biggest element in C - which is max(C') - must have been compared directly with Inf and is in C+.

(Anyway, here the argument: In order to succeed, the algorithm A must have assured that

for all c in C: c < c1 < ... < cm < max(C).

But for the second biggest element in C there can't be any elements in between => m = 0 => a direct comparison must have happened.)

Hence we get the desired max(C') by continuing the algorithm to calculate max(C+) and this is trivially known to only need n+-1 comparisons. (2)

Analog with min(C') := min(C-) in n--1 comparisons (3)

In short the new algorithm is

A' := A + max(C+) + min(C-)

With (1),(2) and (3) we avoided 1 + n+ + n- pseudo-comparisons and needed n+ -1 and n- -1 extra comparisons.

=> o(A',C') = o(A,C) - 3

q.e.d

Generalization

this can be expanded to a general formula d(n) for the optimal number of steps needed for a set of size in

It's obvious that

We have just shown that d(n+2) >= d(n) + 3 (4)

OTOH we can always explicitly construct a new Algorithm A* for a set C* := C + {a,b} with two new elements a,b by adding 3 new comparisons to A (5)

With (4) and (5) we get an exact result d(n+2) = d(n) + 3 for all n or more explicitly:

Meta

In a more formal proof I would have stated various lemmas and used induction and better notation. I would also have a better separation between statements and proofs.

But I hope it's somehow clear.

Cheers Rolf
(addicted to the Perl Programming Language :)
see Wikisyntax for the Monastery

  • Comment on Re: 1. Go compare! Guardian's algortithm riddle and mathematical proof (PROOF)

Replies are listed 'Best First'.
Re^2: 1. Go compare! Guardian's algortithm riddle and mathematical proof (PROOF)
by jo37 (Curate) on Jun 11, 2025 at 20:56 UTC

    Maybe I don't understand your proof correctly or it has a flaw.

    In the three comparisons you propose between the solutions for C and C' you assume there must be a comparison between min and max. This is rarely the case in the implementations regarded as optimal. Thus your proof excludes the relevant algorithms.

    Disproving the existence of a certain algorithm is indeed hard. You must not make any assumptions about its function, otherwise you may exclude some algorithms and the proof becomes worthless.

    First of all, there is a strong need do define what that algorithm shall be. In this specific case I'd define a solving algorithm as follows:

    • It is an nondeterministic state machine.
    • The initial input is the number of items.
    • The state machine then outputs pairs (i, k) of indices and expects the result of m(i) <=> m(k) as the next input.
    • The machine stops after S steps and produces a directed graph G = (V, E) as output with these properties:
      • The vertices V are the indices 1 .. N.
      • There is an edge i -> k in E if and only if m(i) < m(k) has been directly proven in the running phase of the machine.
      • The graph G has one source in min = min m and one sink in max = max m.

    There should be a proof that there is no algorithm that produces a valid output with S < 2 N/3 - 2 comparisons. Such a proof is impossible, as the defined algorithms include those that are cheating. An algorithm may just call for N - 1 comparisons producing a straight sequence of subsequent numbers producing the trivial graph min -> p(2) -> ... ->p(n-1) -> max.

    Therefore we need to modify the setup. We do not preallocate the numbers, but arrange them in a consistent way while requested from the algorithm. E.g. if we are first asked for comparing a < b, then for c < d and afterwards for b < c, we are free to arrange these such that b > c.

    Then we need to prove that an algorithm under such dynamic conditions is not capable of:

    • Finding the solution with S < 3 N/2 - 2 comparisons in every attempt (strict). Or
    • Finding the solution with an average number of S < 3 N/2 - 2 comparisons (relaxed).

    Maybe my expectations are too high, but that is the kind of proof I had expected and I don't feel capable to provide.

    Greetings,
    🐻

    $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
      > In the three comparisons you propose between the solutions for C and C' you assume there must be a comparison between min and max.

      You mean statement (1) ?

      All algorithms must start somewhere, you just pick (Inf,-Inf) for the first pair in algorithm A.¹ And since you already know the result, it's a pseudo comparison - you skip the step it in the constructed algorithm A'.

      Just imagine playing the game with C' 100 covered cards plus two extra dummy cards in your head and applying an algorithm A for C 102 cards.

      The dealer doesn't know what's in your head, he'll never hear a question about the dummy cards, because you already know the necessary results to continue your algorithm.

      Two things are relevant:

      • you got the min&max of C' in the end
      • you asked the dealer 3 questions less than you would have needed for 102 covered cards.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      see Wikisyntax for the Monastery

      ¹) it doesn't matter if it was unlikely. What matters is that A is guaranteed to work with any starting pair.