Evanovich has asked for the wisdom of the Perl Monks concerning the following question:

Dearest monks, I have an array of hashes which I call @database. The keys for all the hashes are identical--they all have the same number of keys and the keys have the same names. I want to take dot products (and from that angles) of the first hash with the other hashes using PDL. Here is what I have:
my $pdl1 = pdl (values(%{$database[0]}));
my $n1 = norm $pdl1;
my ($pdl2, $n2, $dotproduct, $d, @angles);
for (0..$#database) {
    $pdl2 = pdl (values (%{$database[$_]}));
    $n2 = norm $pdl2;
    $dotproduct = inner ($n1, $n2);
    $d = $dotproduct->sclr();
    $angles[$_] = (180/3.1415926)*acos($d);
}
My problem is that the angles aren't right. Could it be that Perl is not returning the values of each hash in a consistent order? Is there a way to easily and efficiently correct for this? Yours humbly, Evan

Replies are listed 'Best First'.
Re: Lining up many hash values
by robartes (Priest) on May 02, 2003 at 06:44 UTC
    Dotproducts and angles bring back scary memories from my uni days, so I'm not going to go there :)

    pzbagel was right in saying that the ordering of hashes in Perl is not deterministic: nothing guarantees that you always get stuff out in the same order as you put them in (in fact, given the nature of the hashing algorithm, you are virtually guaranteed that you get them out in a different order than they went in). If you want to guarantee a consistent order (though not necessarily the initial order), use sort on the list returned by keys %hash.
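
A minimal sketch of the sort-on-keys approach, using two made-up hashes (%first and %second are illustrative names, not from the original post). The same sorted key list is reused for every hash, so the values line up:

```perl
use strict;
use warnings;

# Two hypothetical records with the same keys, written in different orders.
my %first  = (height => 1, width => 2, depth => 3);
my %second = (depth => 6, width => 5, height => 4);

# Sorting the key list fixes one order that can be reused for every hash.
my @keys = sort keys %first;

# Pull the values out of each hash in that shared order.
my @first_values  = map { $first{$_} }  @keys;
my @second_values = map { $second{$_} } @keys;

print "@keys\n";           # depth height width
print "@first_values\n";   # 3 1 2
print "@second_values\n";  # 6 4 5
```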

    However, this seems to be more of a case for arrays than hashes. As you say, all your hashes have identical keys, both in number and name. If you want to associate some kind of descriptive name with the array elements, keep a separate array of names. Something like this:

    my @names=qw/value1 value2 value3/;
    my @array1=(1, 2, 3);
    my @array2=(4, 5, 6);
    Or even use an AoA to store your actual data:
    my @AoA;
    push @AoA, [ 1,2,3 ];
    push @AoA, [ 4,5,6 ];
    #Then use them as such:
    $AoA[0][0]; # 1st element of first array
    $AoA[1][0]; # 1st element of second array
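
One way the names array and the AoA might be combined, as a sketch only: %index_of is a hypothetical lookup table (not part of the reply) that maps each name to its column position, so a column can still be addressed by name.

```perl
use strict;
use warnings;

my @names = qw/value1 value2 value3/;       # descriptive names, one per column
my @AoA   = ( [ 1, 2, 3 ], [ 4, 5, 6 ] );   # one row per record

# Hypothetical helper: map each name to its column index once.
my %index_of;
@index_of{@names} = (0 .. $#names);

# Look up a column by name in any row.
my $v = $AoA[1][ $index_of{value2} ];
print "$v\n";   # 5
```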

    CU
    Robartes-

Re: Lining up many hash values
by pzbagel (Chaplain) on May 02, 2003 at 05:48 UTC

    As is noted in the perl documentation regarding hashes, the order that hash values will come out of a hash is not guaranteed. You should use keys() and then iterate over the keys (getting the value from each hash) to do your dotproducts. This will guarantee you are operating on the correct pairs of values.
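
A small pure-Perl sketch of that idea, without PDL. The contents of @database here are invented for illustration. The key list is taken once and each key is used to fetch the matching value from both hashes:

```perl
use strict;
use warnings;

# Hypothetical data standing in for @database.
my @database = (
    { x => 1, y => 2, z => 3 },
    { x => 4, y => 5, z => 6 },
);

# Take the key list once, from the first hash, and reuse it for every pair.
my @keys = keys %{ $database[0] };

# Multiply values that share a key, so the pairing never depends on hash order.
my $dot = 0;
for my $key (@keys) {
    $dot += $database[0]{$key} * $database[1]{$key};
}

print "$dot\n";   # 32
```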

    Cheers

Re: Lining up many hash values
by BrowserUk (Patriarch) on May 02, 2003 at 07:30 UTC

    I agree with robartes, using hashes for this application is basically the wrong way to do it. It would seem that the only reason you are using the hash is to give meaningful names to the values. An alternative is to use constant names for the array elements via the constant pragma.
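
A sketch of what the constant approach could look like. The names HEIGHT, WIDTH and DEPTH are placeholders, since the original post does not say what the keys are:

```perl
use strict;
use warnings;

# Hypothetical field names; each constant is just an array index.
use constant {
    HEIGHT => 0,
    WIDTH  => 1,
    DEPTH  => 2,
};

# Each record is a plain array, so element order is fixed by construction.
my @record1 = (1, 2, 3);
my @record2 = (4, 5, 6);

print $record1[WIDTH], "\n";   # 2
print $record2[DEPTH], "\n";   # 6
```

The arrays can then be handed to pdl directly, with no question about which position holds which field.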

    However, it's not clear from your snippet whether you are generating the hashes yourself (in which case you could rework the hashes into arrays) or whether they are being generated by a module, which would make life harder.

    You might also look at Tie::IxHash, which will always return the values in the order the keys were created, but there is a performance penalty associated with this which, if the hashes are small, might outweigh the benefits of avoiding sorting. And again, if the hashes are being generated in a module, it could be awkward to persuade it/them to use Tie::IxHash.


    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
Re: Lining up many hash values
by jsegal (Friar) on May 02, 2003 at 16:25 UTC
    Others have noted that perl does not guarantee the order of hash elements. There is, though, a simpler way to get what you want without changing your data structures: use hash slices to extract the values from the hash, instead of using the values function.

    e.g:
    my @keys = keys %{$database[0]}; #optionally use sort keys
    my $pdl1 = pdl (@{$database[0]}{@keys});
    my $n1 = norm $pdl1;
    my ($pdl2, $n2, $dotproduct, $d, @angles);
    for (0..$#database) {
        $pdl2 = pdl (@{$database[$_]}{@keys});
        $n2 = norm $pdl2;
        $dotproduct = inner ($n1, $n2);
        $d = $dotproduct->sclr();
        $angles[$_] = (180/3.1415926)*acos($d);
    }
    A hash slice allows you to extract many elements of a hash at a time, returning a list of values in the order of the keys you passed in. By passing the same @keys array to each hash, you guarantee the values will come out in the same order. If you want to force the order, you can sort the @keys array however you deem appropriate...

    The syntax of hash slices is a little tricky, especially when dealing with references, but you can piece it apart, or what I did is "build it up" from first principles, starting from  @hash{@keys} and going from there to  @$hashref{@keys} to @{$arrayofhashrefs[$arrayindex]}{@keys}.
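
The three steps of that build-up can be sketched side by side. The data is made up, and the variable names simply follow the ones used in the paragraph above:

```perl
use strict;
use warnings;

my @keys = qw/a b c/;

# 1. Slice of a plain hash.
my %hash = (a => 1, b => 2, c => 3);
my @from_hash = @hash{@keys};

# 2. The same slice through a hash reference.
my $hashref = \%hash;
my @from_ref = @$hashref{@keys};

# 3. Slice through one element of an array of hash references.
my @arrayofhashrefs = (
    { a => 1,  b => 2,  c => 3 },
    { c => 30, b => 20, a => 10 },
);
my $arrayindex = 1;
my @from_aoh = @{ $arrayofhashrefs[$arrayindex] }{@keys};

print "@from_hash\n";  # 1 2 3
print "@from_ref\n";   # 1 2 3
print "@from_aoh\n";   # 10 20 30
```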

    Slices (of both hashes and arrays) can give incredible expressive power. They are also more efficient than looping yourself, since the looping is done in compiled C by the perl engine, rather than in interpreted code...

    Hope this helps.

    --JAS