in reply to Re: RFC - inplace upgrade for Tie::SortHash
in thread RFC - inplace upgrade for Tie::SortHash

demerphq,
I really appreciated your comments. I haven't touched this for almost a month. I was waiting on the module author to return my emails, which did not happen. Today I decided to start over. I would like to address the points you made on the previous code:

  • I cant help but wonder if your use of autovivification to create the various elements of the object is dangerous
    The code creates a new hash from an old hash. It is only dangerous to the extent that I trusted that what I was being passed was a hash reference without testing it.

  • Why arent you using warnings?!
    The warnings were not turned on because the original module author did not use them.

  • Where did the 'a' key go?
    Code I added to optimize the iteration created an array with the hash keys in order. I figured another optimization would be to delete the array element without going through the sort routine again. The problem was that I didn't first verify that the key being deleted actually existed. The 'a' disapeared because it happened to be the 0 index - the result of an undefined variable

  • Also delete() doesnt set the CHANGED flag, and will not play nicely with keys. (It should always be possible to delete the last visited key while iterating the hash without disrupting the hash.)
    Correct - it doesn't set the changed flag. It accomplishes the same thing by deleting the array element described above. I do not believe there would be any disruption with deleting a key while iterating other than your discovery of deleting a key that doesn't exist.

  • If you intend to set these two to undef then I can think of more obvious ways. Otherwise this is a bug.
    defined $self->[ARRAY][$index] ? $self->[ARRAY][$index] : undef; What exactly is the purpose of this line? Isnt the conditional totally + redundant?

    The code is correct as written. The conditional is designed to detect when you have fallen off the end of the array and return undef. I am not sure what is more obvious in undef'ing the arrays then what I used?

  • I think you need to revisit the code and build up a big set of tests. I have a feeling there are other whammies in there as well.
    I indeed did find lots of whammies today when I went back over the code. A few I introduced when I tried to "fix", in my opinion, very bad code. The rest were already there.

  • Another question I have is why the iterator requires a hash lookup each time?
    I use an array with the hash keys in order. The problem is that I am only aware of the last key. I use the hash lookup to get the index of that element in the array. Getting the next key is then a simple matter of adding 1.

    Later, when I have time to build up a test suite and finish the POD, I will re-post an RFC that outlines my questions and concerns. In the interim, here is the new non-backwards compatible code.

    #!/usr/bin/perl -w package Tie::SortedHash; use strict; use Carp; use constant HASH => 0; use constant LOOKUP => 1; use constant ARRAY => 2; use constant SORT => 3; use constant CHANGED => 4; use constant OPT => 5; our $VERSION = '1.00'; sub TIEHASH { my $class = shift; croak "Incorrect number of parameters" if @_ % 2; my %options = @_; my $self = bless [], $class; $self->_Build(\%options); return $self; } sub FETCH { my($self, $key) = @_; $self->[HASH]{$key}; } sub STORE { my($self, $key, $value) = @_; $self->[HASH]{$key} = $value; $self->[CHANGED] = 1; } sub EXISTS { my($self, $key) = @_; exists $self->[HASH]{$key}; } sub DELETE { my($self, $key) = @_; delete $self->[HASH]{$key}; if ($self->[OPT] == 2 && exists $self->[LOOKUP]{$key}) { splice(@{$self->[ARRAY]}, $self->[LOOKUP]{$key}, 1); delete $self->[LOOKUP]{$key}; } } sub FIRSTKEY { my $self = shift; $self->_ReOrder if $self->[OPT] == 1 || ($self->[OPT] == 2 && $sel +f->[CHANGED]); $self->_Iterate; } sub NEXTKEY { my ($self, $lastkey) = @_; $self->_Iterate($lastkey); } sub CLEAR { my $self = shift; $self->[HASH] = {}; $self->[CHANGED] = 1; } sub DESTROY { } sub _Build { my ($self, $opt) = @_; my $sort = exists $opt->{SORT} ? $opt->{SORT} : sub { my $hash = shift; sort {$a cmp $b || $a <=> $b} keys %$hash; }; $self->sortroutine($sort); my $hash = exists $opt->{HASH} ? $opt->{HASH} : {}; croak "$hash is not a hash ref" unless ref $hash eq 'HASH'; @{$self->[HASH]}{keys %$hash} = values %$hash; if (exists $opt->{OPTLEVEL}) { croak "$opt->{OPTLEVEL} is not valid optimization level" if $o +pt->{OPTLEVEL} !~ /^[123]$/; $self->[OPT] = $opt->{OPTLEVEL}; } else { $self->[OPT] = 1; } } sub _ReOrder { my $self = shift; $self->[LOOKUP] = (); $self->[ARRAY] = (); my $index = 0; for my $key ($self->[SORT]($self->[HASH])) { $self->[LOOKUP]{$key} = $index; $self->[ARRAY][$index] = $key; $index++; } $self->[CHANGED] = 0; } sub _Iterate { my ($self, $lastkey) = @_; unless ($self->[OPT] == 3) { my $index = defined $lastkey ? $self->[LOOKUP]{$lastke +y} : -1; $index++; return defined $self->[ARRAY][$index] ? $self->[ARRAY][$index] + : undef; } else { my $match; for my $key ($self->[SORT]($self->[HASH])) { + return $key if $match || ! defined $lastkey; $match = 1 if $key eq $lastkey; } } return undef; } sub sortroutine { my($self, $sort) = @_; croak "$sort is not a code ref" unless ref $sort eq 'CODE'; $self->[SORT] = $sort; $self->[CHANGED] = 1; } 1; __END__ =head1 NAME Tie::HashSort - Perl module to get hash keys in a sorted order =head1 SYNOPSIS use Tie::HashSort; my %hash = ( 'John' => 33, 'Jacob' => 29, 'Jingle' => 15, 'Heimer' => 48, 'Smitz' => 12, ); my $sort = sub { my $hash = shift; sort {$hash->{$b} <=> $hash->{$a}} keys %$hash; }; tie my %sorted_hash, 'Tie::SortedHash', 'HASH' => \%hash, 'SORT' = +> $sort, 'OPTLEVEL' => 2; for my $name ( keys %sorted_hash ) { print "$name is $hash{$name} ears old.\n"; } ### OUTPUT ### Heimer is 48 ears old. John is 33 ears old. Jacob is 29 ears old. Jingle is 15 ears old. Smitz is 12 ears old. =head1 DESCRIPTION This module is a designed to retrieve hash keys in a pre-defined sorte +d order. It is often frustrating to have a hash return elements in a seemingly +random order when using C<keys()>, C<values()> or C<each()>.
    Cheers - L~R
  • Replies are listed 'Best First'.
    Re: Re: Re: RFC - inplace upgrade for Tie::SortHash
    by demerphq (Chancellor) on Sep 05, 2003 at 16:56 UTC

      Today I decided to start over. I would like to address the points you made on the previous code:

      Hmm, im not up for responding completely yet, so you can expect a /msg that this reply has been updated. However, there is one point id like to address:

      The code is correct as written. ...

      defined $self->[ARRAY][$index] ? $self->[ARRAY][$index] : undef;

      Except that its misleading as hell. It is _exactly_ equivelent to

      $self->[ARRAY][$index]

      Except that it makes a reader wonder what magic specialness is going on, and is marginally slower. If nothing special is happening then it should be changed to the latter IMO


      ---
      demerphq

      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...
        demerphq,
        I have to disagree.
        defined $self->[ARRAY][$index] ? $self->[ARRAY][$index] : undef;
        Is equivalent to:
        if (defined $self->[ARRAY][$index]) { return $self->[ARRAY][$index]; } else { return undef; }
        Cheers - L~R

          Yes, and its exactly equivelent to

          return $self->[ARRAY][$index];

          What is the point of the if in your code? If the values of $self->[ARRAY][$index] is defined then we return the value. If its not then we return undef. Whats the point? The value of $self->[ARRAY][$index] should simply be returned. There is no difference. (Except the one is more efficient and less confusing.)

          Consider this using different values:

          sub foo { my $x=shift; if ($x=10) { return 10; } else { return $x; } }

          Which is just plain silly isn't it?


          ---
          demerphq

          <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...