in reply to Returning the lowest key in a hash (or highest)

Here's a tied hash solution that caches min and max. It turns these lookups into an O(1) operation, but an O(n) penalty is paid on delete's of min and max keys. Of course, there's also a performance loss due to the tied interface, but we're CS people after all, and we don't concern ourselves with messy practicalities like that, right? =)

Yes indeed, this required considerably more code than I had anticipated...

update: Added tilly's optimization (doh!), so now delete's are overall an O(1) operation. Just don't go deleting the keys in sorted order :)
update2: Added dkubb's optimization to _min and _max routines... good call!

use strict; package MinMaxHash; sub HASH () { 0 }; sub MIN () { 1 }; sub MAX () { 2 }; sub new { my $class = shift; my $self = bless {}, $class; tie %$self, $class; $self; } sub minkey { my $self = shift; (tied %$self)->[MIN]; } sub maxkey { my $self = shift; (tied %$self)->[MAX]; } sub TIEHASH { bless [], shift; } sub STORE { my ($self, $key, $val) = @_; $self->[HASH]{$key} = $val; if (!defined $self->[MIN] or $key < $self->[MIN]) { $self->[MIN] = $key; } if (!defined $self->[MAX] or $key > $self->[MAX]) { $self->[MAX] = $key; } } sub FETCH { my ($self, $key) = @_; $self->[HASH]{$key}; } sub DELETE { my ($self, $key) = @_; delete $self->[HASH]{$key}; $self->[MIN] = _min(keys %{$self->[HASH]}) if $key == $self->[MIN]; $self->[MAX] = _max(keys %{$self->[HASH]}) if $key == $self->[MAX]; } sub CLEAR { my $self = shift; $self = []; } sub EXISTS { my ($self, $key) = @_; exists $self->[HASH]{$key}; } sub FIRSTKEY { my $self = shift; () = keys %{$self->[HASH]}; # reset iterator each %{$self->[HASH]}; } sub NEXTKEY { my $self = shift; each %{$self->[HASH]}; } # tilly^H^H^H^H^Hdkubbs's min & max subs... used only during DELETE's sub _max { my $max = shift; $max < $_ and $max = $_ for @_; $max; } sub _min { my $min = shift; $min > $_ and $min = $_ for @_; $min; } ### EXAMPLE ### package main; use Data::Dumper; my $h = new MinMaxHash; $h->{1} = 'foo'; $h->{2} = 'bar'; $h->{3} = 'baz'; $h->{10} = 'goop'; $h->{-1} = 'moo'; print "Minkey is ", $h->minkey, $/; print "Maxkey is ", $h->maxkey, $/; print Dumper $h;
   MeowChow                                   
               s aamecha.s a..a\u$&owag.print

Replies are listed 'Best First'.
Re (tilly) 2: Returning the lowest key in a hash (or highest)
by tilly (Archbishop) on Mar 25, 2001 at 05:55 UTC
    Tied interfaces do tend to take more code than you would at first expect, don't they?

    But a slight optimization note. In your DELETE method there is no need to recompute MAX and MIN every time. Just put a check for whether the deleted element is MAX or MIN and then compute them accordingly. This makes DELETE efficient most of the time. Well algorithmically efficient, as you note Perl's tied interface could be snappier...

(dkubb) Re: (3) Returning the lowest key in a hash (or highest)
by dkubb (Deacon) on Mar 25, 2001 at 07:22 UTC

    I have one more optimization for you MeowChow. Inside your _min() and _max() subroutines, if the test returns false, it reassigns the original value back to itself. You can get around a 25% speed increase in the algorithm if you only assign the $_ value to $max or $min if it passes the test.

    Here's a short benchmark that illustrates the difference better:

    #!/usr/bin/perl -w use Algorithm::Numerical::Shuffle qw(shuffle); use Benchmark qw(cmpthese); #Randomize the array contents my @array = shuffle 0..1000; cmpthese(-3, { max_dkubb1 => sub { max_dkubb1(@array) }, max_tilly => sub { max_tilly(@array) }, }); sub max_dkubb1 { my $max = shift; $max < $_ and $max = $_ for @_; return $max; } sub max_tilly { my $max = shift; $max = $max < $_ ? $_ : $max for @_; return $max; }

    On my system the max_dkubb1() routine runs just over 25% faster than the other.

      A lot of that is 'and' being a lot faster than ?:, rather than the multiple assigns.

      Update:  Wrong!  jcwren challenged me to benchmark it and almost all the time is in the assigns.

      p
        Cool!

        Update: darn