This morning I came across this problem (see the code below). For a while I had the wrong impression that hash keys would keep their own "type". Now I realize that Perl actually stringifies hash keys, so when I do a sort keys, it gives me alphabetical order, not the numeric order I was expecting.

I am wondering whether it would be more reasonable to make the native order the default order, unless:
  1. there is no native order for that type (for example, I have an OO class called Book; it would be nice if I had a way to define the native order for this class as sorting by ISBN, which is an attribute of the Book class), or
  2. the keys are of different types (for example, some are numbers and some are strings),
in which case use the stringified order.

    use Data::Dumper;
    use strict;

    my $hash = {};
    for (1..100) {    # keys are all numbers, from the user's point of view
        $hash->{$_} = "a";
    }

    my @keys = keys %$hash;    # this is the internal order, I don't care
    print(join(",", @keys));
    print "\n", "=" x 70, "\n";

    my @sorted = sort @keys;    # this will sort as strings!!
    print(join(",", @sorted));
    print "\n", "=" x 70, "\n";

    my @sorted_as_num = sort {$a <=> $b} @keys;    # force it to sort as numbers
    print(join(",", @sorted_as_num));
    print "\n", "=" x 70, "\n";

    print Dumper($hash);

Re: sort hash keys as numbers and maybe even more...
by chromatic (Archbishop) on Feb 05, 2003 at 21:53 UTC

    Implementation-wise, how do you do this? Do you start off with a numerical sort, then redo the sort partway through if you detect a string? Do you require that lists be marked as containing only numbers? How do you handle tied variables or objects that can numify and stringify at will?

    I don't think it's easy, and that's why the second sentence of perlfunc for sort says:

    If SUBNAME or BLOCK is omitted, "sort"s in standard string comparison order.
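One way to picture the "detect a string" idea is a helper that scans the whole list first and only then picks a comparison. This is only a rough sketch: `smart_sort` is a made-up name, and using `looks_like_number` from Scalar::Util as the test for "is a number" is an assumption. It does nothing about tied variables or objects that numify and stringify at will.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: sort numerically only when every element looks like
# a number; otherwise fall back to Perl's default string order.
sub smart_sort {
    my @items = @_;

    # Count the elements that do not look like numbers (one full pass).
    my $non_numeric = grep { !looks_like_number($_) } @items;

    if ($non_numeric == 0) {
        return sort { $a <=> $b } @items;
    }
    return sort { $a cmp $b } @items;
}

my @numeric = smart_sort(10, 9, 100, 1);
my @mixed   = smart_sort(10, 9, 'apple', 1);

print "@numeric\n";    # all numbers: numeric order
print "@mixed\n";      # one string present: string order for everything
```

Note the cost: the list has to be walked once before sorting even starts, and a single stray string silently changes the order of every other element.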
      I agree, it would not be easy. I'm not trying to argue about this, just thinking about what the root cause is, other than that Perl does not support strong typing.

      I think the other important thing that contributes to this problem is that Perl is not really OO. Think of it this way: in an OO language like Java or C++, a list always contains virtually the same type of element, even though the elements might actually be quite different, so you can always use the sort defined on the weakest common type (in the worst case falling back to the Object class, which is the base of everything).

      In an OO language, knowing what is contained in a list is obviously not a problem.

        Java allows you to define arrays of type Object and put whatever you want into them -- good luck sorting those.
        Object fruit[] = new Object[2];
        fruit[0] = new Apple();
        fruit[1] = new Orange();
        I think the other important thing that contributes to this problem is that Perl is not really OO.

        Surely that's a typing issue rather than an OO issue? (e.g. Eiffel & CLOS allow lists/collections of any type and are both "OO". ML types lists and isn't.)

      Hmm, perhaps if the value used as a key has a 'hash' property, it's used as the internal hashvalue; by default it stringifies and uses the built-in hash algorithm on the string.

      What order should keys be sorted in when listing them? Well, each and keys don't have any kind of order in the first place. If you want an order, use sort. The block argument to sort specifies the comparison to use. The default ordering is string cmp, and that has nothing to do with hashes. That's just what you get if you sort keys.

      —John

Re: sort hash keys as numbers and maybe even more...
by Abigail-II (Bishop) on Feb 05, 2003 at 23:55 UTC
    The fact that the scalars are used as hash keys is a red herring. It's completely irrelevant. If you sort without a sort function, Perl will always sort in string order. If you do:
    perl -wle 'print for sort 1 .. 100'

    the first lines of output are:

    1
    10
    100
    11
    12

    Abigail

Re: sort hash keys as numbers and maybe even more...
by Juerd (Abbot) on Feb 06, 2003 at 18:44 UTC

    In Perl 5, hash keys are strings. Numbers, references, undef and globs are stringified when used as a hash key. Tie::RefHash solves some problems.

    This all has nothing to do with how hash keys are sorted. sort doesn't care or even know the list you give it is a list of hash keys. It sorts the same way it always does: using the code you supply or { $a cmp $b } if you didn't define a method of sorting.
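A small sketch of the stringification described above, for anyone who wants to see it happen. The variable names are only for illustration.

```perl
use strict;
use warnings;

my %h;
my $aref = [1, 2, 3];
my $n    = 1.0;

$h{42}    = 'number';
$h{$aref} = 'reference';    # key becomes a string like "ARRAY(0x...)"
$h{$n}    = 'float';        # 1.0 stringifies to "1"
$h{"1"}   = 'string';       # same key as above, so this overwrites 'float'

for my $key (keys %h) {
    # Every key comes back as a plain string; none of them is a reference.
    printf "%-20s is a reference: %s\n", $key, ref($key) ? 'yes' : 'no';
}

print scalar(keys %h), " keys\n";
```

The array reference cannot be recovered from its key, which is the problem Tie::RefHash is meant to address.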

    Juerd
    - http://juerd.nl/
    - spamcollector_perlmonks@juerd.nl (do not use).
    

Re: sort hash keys as numbers and maybe even more...
by demerphq (Chancellor) on Feb 08, 2003 at 23:41 UTC
    You may find the ideas and code in (Golf) Keysort useful.

    --- demerphq
    my friends call me, usually because I'm late....

Re: sort hash keys as numbers and maybe even more...
by John M. Dlugosz (Monsignor) on Feb 07, 2003 at 20:43 UTC
    there is no native order for that type (for example, I have an OO class called Book; it would be nice if I had a way to define the native order for this class as sorting by ISBN, which is an attribute of the Book class)

    Sure, that works now, I should think. Overload cmp for the Book class, and sort an array of refs to Books. It should call your comparison code for each pair of Books it needs to figure out the order, without having to say so in the sort call.
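A minimal sketch of that suggestion, assuming a hash-based Book object with `title` and `isbn` fields. The field names, the `compare_isbn` helper and the sample values are all invented for illustration. The comparison is spelled out as `$a cmp $b` in the sort block here to keep the example unambiguous.

```perl
package Book;
use strict;
use warnings;
use overload 'cmp' => \&compare_isbn;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Called for "cmp" whenever at least one operand is a Book.
sub compare_isbn {
    my ($left, $right, $swapped) = @_;

    # The other operand may be a plain string rather than a Book.
    my $other  = ref $right ? $right->{isbn} : $right;
    my $result = $left->{isbn} cmp $other;

    # If perl swapped the operands, flip the sign back.
    return $swapped ? -$result : $result;
}

package main;

# Placeholder identifiers, not real ISBNs.
my @books = (
    Book->new(title => 'C', isbn => '0003'),
    Book->new(title => 'A', isbn => '0001'),
    Book->new(title => 'B', isbn => '0002'),
);

my @sorted = sort { $a cmp $b } @books;
print join(", ", map { $_->{title} } @sorted), "\n";
```

The ordering rule lives in the class, so every caller that compares Books with cmp gets the same order without repeating the field name.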

    —John

Re: sort hash keys as numbers and maybe even more...
by steves (Curate) on Feb 08, 2003 at 23:59 UTC

    One approach to redefining sort order is to not use the sort function explicitly, but instead to tie the hash up front so that keys are maintained in a specific order and the contract with the rest of the code is to just use the keys function to get the right order. A lot of extra overhead maybe, but something to consider.
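A rough sketch of the tie approach, assuming all keys are numeric. `NumericOrderHash` is a made-up class name, and re-sorting at the start of every iteration is just the simplest thing that works, not a tuned implementation.

```perl
package NumericOrderHash;    # hypothetical class name
use strict;
use warnings;

sub TIEHASH { my ($class) = @_; return bless { data => {}, order => [] }, $class }
sub STORE   { my ($self, $key, $value) = @_; $self->{data}{$key} = $value }
sub FETCH   { my ($self, $key) = @_; return $self->{data}{$key} }
sub EXISTS  { my ($self, $key) = @_; return exists $self->{data}{$key} }
sub DELETE  { my ($self, $key) = @_; return delete $self->{data}{$key} }
sub CLEAR   { my ($self) = @_; %{ $self->{data} } = () }

# Iteration: sort once at the start of each pass, then hand out keys in order.
sub FIRSTKEY {
    my ($self) = @_;
    $self->{order} = [ sort { $a <=> $b } keys %{ $self->{data} } ];
    return shift @{ $self->{order} };
}

sub NEXTKEY { my ($self) = @_; return shift @{ $self->{order} } }

package main;

tie my %hash, 'NumericOrderHash';
$hash{$_} = 'a' for (10, 9, 100, 1);

my @keys = keys %hash;    # no explicit sort at the call site
print "@keys\n";
```

The overhead mentioned above is visible here: every call to keys triggers a full sort, and every access goes through a method call.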

    Approach #2, which I have used to a degree, is to object-ify any hashes of this nature (or this sort even 8-) and have each provide its own sort method. Since object overhead is relatively low in Perl, this one usually works out fairly well in my experience. Still a lot of extra work if it's for localized use.

    Whenever I see a suggestion to change a base language characteristic, my mind sends out a "wrapper" alert.