http://qs1969.pair.com?node_id=510525


in reply to Puzzle: The Ham Cheese Sandwich cut.

As ambrus said, even the warm-up problem is pretty damned hard. I too cheated by looking in Cormen, Leiserson and Rivest. Here is a Perl implementation of the linear-time algorithm they give.
sub naive_median {
    # Median by sorting; only ever used on the five-element groups below.
    (sort { $a <=> $b } @_)[@_/2];
}

sub nth_largest {
    # Select the element with rank $n (0-based, ascending) in @a.
    my ($n, @a) = @_;
    die "You can't find the ${n}th-largest element of an " . @a . "-element array!"
        if $n > $#a || $n < 0;
    #warn "Looking for ${n}th element of (@a)\n";
    return $a[0] if @a == 1;

    # Median of the medians of groups of five gives a guaranteed-decent pivot.
    my @medians;
    for (my $i = 0; $i < @a; $i += 5) {
        push @medians, naive_median(@a[$i .. ($i + 4 > $#a ? $#a : $i + 4)]);
    }
    my $median = median(@medians);

    # Three-way partition around the pivot; the middle band keeps duplicates
    # of the pivot from sending us into an endless recursion.
    my @smaller = grep { $_ < $median } @a;
    return nth_largest($n, @smaller) if $n < @smaller;
    my @equal = grep { $_ == $median } @a;
    return $median if $n < @smaller + @equal;
    my @larger = grep { $_ > $median } @a;
    return nth_largest($n - @smaller - @equal, @larger);
}

sub median {
    unshift @_, int(@_/2);
    goto &nth_largest;
}
In practice it's pretty inefficient, and even proving that it runs in linear time is not entirely trivial!
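
As a quick sanity check (the sample values below are arbitrary, and the snippet assumes the three subs above are already defined), the results should agree with a plain sort:

use strict;
use warnings;

my @data = (9, 3, 41, 7, 15, 2, 8, 23, 4, 16, 42);

# Cross-check the selection routine against an ordinary sort.
my @sorted = sort { $a <=> $b } @data;
for my $n (0 .. $#data) {
    my $got = nth_largest($n, @data);
    print "mismatch at n=$n: got $got, expected $sorted[$n]\n"
        unless $got == $sorted[$n];
}

print "median of (@data) is ", median(@data), "\n";   # should print 9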

Re^2: Puzzle: The Ham Cheese Sandwich cut.
by Roy Johnson (Monsignor) on Nov 21, 2005 at 19:17 UTC
    That goto is not helpful.

    Caution: Contents may have been coded under pressure.
      Hmm, that's interesting. But I bet the goto is faster if @_ has, say, a million elements. You're saving an awful lot of copying. Update: I lost this bet :-)

      It also saves a fair amount of stack space.

        It isn't the goto that saves the copying. It's the use of & without parentheses to call the function. As far as I can tell, the goto is a useless relic.

        Update: Ok, not entirely useless: if you want a routine not to return to where you called it from, goto is the way to do it. In other words, use it when you want to do weird control flow (which is what goto is all about). Example:

        sub one {
            print "In one\n";
            goto &two;
            print "never print this if you goto\n";
        }

        sub two {
            print "In two\n";
        }

        one();
        It's like exec for the subroutine domain.
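
        To make the contrast concrete, here is a small sketch (the sub names are made up for illustration). Both forms reuse the current @_ without copying it; only the goto form skips returning to the wrapper:

        use strict;
        use warnings;

        sub show_args { return "show_args saw: @_"; }

        sub with_amp {
            # &show_args with no parens reuses the current @_ without copying,
            # but control comes back here and the wrapper can post-process.
            my $result = &show_args;
            return "with_amp wrapped [$result]";
        }

        sub with_goto {
            # goto &show_args also reuses @_, but replaces this stack frame:
            # show_args returns directly to with_goto's caller.
            goto &show_args;
            print "never reached\n";
        }

        print with_amp("a", "b"), "\n";    # with_amp wrapped [show_args saw: a b]
        print with_goto("a", "b"), "\n";   # show_args saw: a b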

        Caution: Contents may have been coded under pressure.
Re^2: Puzzle: The Ham Cheese Sandwich cut.
by hv (Prior) on Nov 22, 2005 at 13:19 UTC

    I suspect that a proof of the running time order will concentrate on the expected depth of recursion.

    However, I believe it will be much harder to prove that the push is O(1) - indeed I suspect it is not - and without that the algorithm as a whole cannot be O(n).

    Hugo

      No, the proof doesn't need to reason about expected running time; the bound is worst-case. The running time T(N) satisfies the recurrence
      T(N) = T(N/5) + T(7N/10 + 10) + O(N),
      which has T(N) = O(N) as a solution.
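      (A sketch of the substitution step, with constants a and c introduced just for this argument: guess T(N) <= cN and check that the guess is self-consistent.)

      \begin{align*}
      T(N) &\le c\,\tfrac{N}{5} + c\left(\tfrac{7N}{10} + 10\right) + aN \\
           &= \tfrac{9}{10}cN + 10c + aN \\
           &\le cN \quad \text{once } c \ge 20a \text{ and } N \ge 200,
      \end{align*}

      since then 10c + aN <= cN/10; the finitely many small N are covered by taking c large enough, so T(N) = O(N).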
      However, I believe it will be much harder to prove that the push is O(1) - indeed I suspect it is not - and without that the algorithm as a whole cannot be O(n).
      It doesn't have to be. What's needed is that push has an amortized running time of O(1) - that is, if we perform N pushes, the total running time is still bounded by O(N). And from what I understand of how Perl grows its arrays (an extra 20% of memory is claimed on each reallocation), push does have amortized O(1) performance. A single push may take Θ(N) time, but over N pushes that cost averages out.
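
      A rough way to see it is to count element copies directly. This little simulation is only illustrative - the 20% growth factor is hard-coded here, and the real interpreter may resize differently - but the per-push average stays flat:

      use strict;
      use warnings;

      # Simulate N pushes into an array that grows by ~20% whenever it is full;
      # every reallocation copies all existing elements.
      my $n_pushes = 1_000_000;
      my ($capacity, $size, $copies) = (1, 0, 0);

      for (1 .. $n_pushes) {
          if ($size == $capacity) {
              $copies  += $size;                       # copy everything over
              $capacity = int($capacity * 1.2) + 1;    # grow by roughly 20%
          }
          $size++;
      }

      printf "%d pushes, %d copies, %.2f copies per push\n",
          $n_pushes, $copies, $copies / $n_pushes;
      # The average hovers around five or six copies per push no matter how
      # large $n_pushes gets - amortized O(1).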
      Perl --((8:>*