PenguinPwrdBox has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks. This is my first post. I'm quite new to Perl, and am only now starting to churn out useful things regularly. I do, though, have a question... I understand that the order of data in hashes is determined and optimized by Perl. However, is that order static? What I mean is: if I created an array ordered according to the order of the keys in the hash, could the hash's order change the next time Perl sees fit? Or does it remain static until I change a key=value pair? And I have to say, being a member of several forums, I have never seen anything like this. It's very different, and very cool :) Thank you in advance. -Adam

Re: Determining hash order for sorting array
by davidrw (Prior) on Aug 21, 2005 at 15:01 UTC
    From perldoc -f keys section of the perlfunc manpage:
    The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the "values" or "each" function produces (given that the hash has not been modified).
    So, in short, you cannot depend on the keys, values, or each functions to return results in a predetermined order. There are ways around this... I believe (haven't actually used it personally) that Tie::Hash (update: Tie::IxHash) can (will?) preserve hash order for you. Also, you can always sort the results of keys/values calls. For example:
    my %h = ( ... );
    foreach my $k ( sort { lc $a cmp lc $b } keys %h ) {
        # get keys sorted by lowercase of key
        ...
    }
    foreach my $k ( sort { lc $h{$a} cmp lc $h{$b} } keys %h ) {
        # get keys sorted by lowercase of value
        ...
    }
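To make the quoted guarantee concrete, here is a minimal sketch (the hash contents are made up for illustration). As long as the hash is not modified between the calls, keys and values line up index by index, and an explicit sort gives an order that does not depend on how Perl stores the hash.

```perl
use strict;
use warnings;

# Hypothetical data, for illustration only.
my %h = ( apple => 3, Banana => 7, cherry => 5 );

# keys and values return elements in the same (unspecified) order,
# provided the hash is not modified between the two calls.
my @k = keys %h;
my @v = values %h;

my $aligned = 1;
for my $i ( 0 .. $#k ) {
    $aligned = 0 unless $h{ $k[$i] } == $v[$i];
}
print "keys and values line up\n" if $aligned;

# A sorted copy of the keys is stable no matter how Perl stores the hash.
my @sorted = sort { lc $a cmp lc $b } keys %h;
print "@sorted\n";    # apple Banana cherry
```

An array built from a plain `keys %h` is only a snapshot: it stays as it was even if the hash's own order later changes, but it should not be expected to match a fresh `keys %h` after the hash has been modified.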
      Also, the iteration order can change drastically after any new key is added, since that key may have forced an internal rehash into a different number of internal bins for efficiency. Two identically-filled hashes might have different iteration orders because of attack-proofing in the hash function. And so on.

      In short, while the word "random" is not accurate, one should never depend on the iteration order of a hash.

      --
      [ e d @ h a l l e y . c c ]

      I believe (haven't actually used it personally) that Tie::Hash can (will?) preserve hash order for you.
      It doesn't. However, there are a few modules that tie a hash so that its key order is the same as the insertion order: Tie::IxHash (Pure Perl) and Tie::Hash::Indexed (ditto, but XS).
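For comparison, when a tied hash is more than is needed, insertion order can also be tracked by hand with a parallel array. This is a minimal core-Perl sketch of that idea, not the API of either module; the `add_pair` helper and the sample data are hypothetical.

```perl
use strict;
use warnings;

my %h;        # the data itself
my @order;    # keys in the order they were first inserted

# Hypothetical helper: store a pair and remember when the key first appeared.
sub add_pair {
    my ( $key, $value ) = @_;
    push @order, $key unless exists $h{$key};
    $h{$key} = $value;
    return;
}

add_pair( zebra => 1 );
add_pair( apple => 2 );
add_pair( mango => 3 );
add_pair( zebra => 9 );    # updating an existing key does not move it

# Iterate over @order, not keys %h, to get insertion order.
print "$_=$h{$_}\n" for @order;
```

The trade-off is bookkeeping: deleting a key would also require removing it from `@order`, which is the kind of detail the tied-hash modules handle for you.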
Re: Determining hash order for sorting array
by davido (Cardinal) on Aug 21, 2005 at 15:38 UTC

    Honestly, though a hash's order may remain unchanged from one call to keys to the next, any design that relies on any notion of hash order is probably in need of a change, unless you're planning on using a tied hash with a well-defined behavior with regard to hash order.

    What are you actually trying to do, by the way? Sometimes if you can sort of describe the task at hand, a better solution (better than relying on hash element orders) will turn up in the ensuing discussion.


    Dave

      Here is essentially what I am doing... I have been charged with the task of unifying all local login information for about 300 Unix machines. This means UIDs, GIDs, etc. What I have done is create a Perl script that harvests all the lines in /etc/passwd (this will be adapted for shadow and group as well; it's only a matter of changing the source for the file handle) and then maps them using a key=>value pair of username=>line_in_passwd. I then use a simple while loop to parse each value, split it into the actual data (sans :), and use a DBI call to place it into a MySQL DB, along with the current working hostname. I have also populated an array with the usernames, the idea being that if a key matches an entry in the array, the DB already contains the info for that user/host pair, and the code should branch to check the values for that user to ensure nothing has changed. As of now it does not branch; I have simply been playing with the logic to ensure that if the statement is true, it will. Here is a snippet of that logic block. Please excuse the syntax... for I am nowhere near as effective yet as I would like to be :)
      CHILD: while ( ( my $key, my $value ) = each %passwd ) {
          for (@existing_users) { next CHILD if /$key/; }
          $value =~ /\w*:(.*):(\d*):(\d*):(.*):(.*):(.*)/;
          my $query = "INSERT INTO passwd (username, pwd_loc, uid, gid, gecos, home_dir, shell, hostname) "
                    . "values('$key', '$1', '$2', '$3', '$4', '$5', '$6', '$hostname')";
          my $sth = $dbh->prepare($query);
          $sth->execute() || die("Couldn't exec sth!");
      }
      The next CHILD if statement will be replaced with a subroutine to check for differences if I can get this right. My thought, regarding the hash order, was to see if there may be a better way to scan that hash for an existing user entry in the DB. And just by the way - I know that you Monks are huge advocates of using .pm's. There may indeed be one out there that I could adapt for this, however, being as new as I am to both Perl, and programming in general, it's in my best interest to reinvent the wheel :) Thanks again for the help Monks.
        there may be a better way to scan that hash for an existing user entry in the DB

        Indeed - instead of storing your identified usernames in an array (list), create a hash of existing user names. Then you can do:

        if ($existing_users{$key} ) { ...do some stuff... }

        Here, instead of pushing the user names retrieved from your DB onto an @existing_users array, you would populate a %existing_users hash with 1s, using the user names as the keys.
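A minimal sketch of that suggestion, with the database left out so it runs on its own. The user names, the passwd lines, and the `@from_db` and `@new_rows` arrays are all hypothetical stand-ins; in the real script the existing names would come from a query and the new rows would go to an INSERT.

```perl
use strict;
use warnings;

# Hypothetical usernames already in the database (would come from a SELECT).
my @from_db = qw(root adam);
my %existing_users = map { $_ => 1 } @from_db;

# Hypothetical username => /etc/passwd line pairs.
my %passwd = (
    root  => 'root:x:0:0:root:/root:/bin/sh',
    adam  => 'adam:x:1000:1000:Adam:/home/adam:/bin/bash',
    adams => 'adams:x:1001:1001::/home/adams:/bin/bash',
);

my @new_rows;
for my $user ( sort keys %passwd ) {
    # Exact-match lookup; a pattern like /adam/ would also match "adams".
    next if $existing_users{$user};

    # A limit of -1 keeps trailing empty fields (for example an empty shell).
    my ( $name, $pwd, $uid, $gid, $gecos, $home, $shell )
        = split /:/, $passwd{$user}, -1;
    push @new_rows, [ $name, $uid, $gid, $home, $shell ];
}

print join( ',', @$_ ), "\n" for @new_rows;
```

The hash lookup replaces the inner loop over the array, so each user costs one lookup instead of a scan, and it only matches whole names.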

        Forget that fear of gravity,
        Get a little savagery in your life.