Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

Re: Hashes aren't being differently randomized

by vkon (Curate)
on Jun 26, 2006 at 17:58 UTC ( #557624=note: print w/replies, xml ) Need Help??

in reply to Hashes aren't being differently randomized

In my opinion, hash randomization issue do not mean that you're guaranteed different keys order on different runs (but sometimes you will), it means something different.

algorithm complexity attack only means you'll not get hash slowness because all hash items felt into same bucket so extracting hash element will be O(N), instead of O(log(N))

However the moral is - you must not pretend that keys order is same between runs (so some programs probably stopped working when hash randomization was introduced)

  • Comment on Re: Hashes aren't being differently randomized

Replies are listed 'Best First'.
Re^2: Hashes aren't being differently randomized
by shmem (Chancellor) on Jun 26, 2006 at 18:15 UTC
    This is explicitly stated in perlsec:
    Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order.

    Also note that while the order of the hash elements might be randomised, this "pseudoordering" should not be used for applications like shuffling a list randomly (use List::Util::shuffle() for that, see List::Util, a standard core module since Perl 5.8.0; or the CPAN module Algorithm::Numerical::Shuffle), or for generating permutations (use e.g. the CPAN modules Algorithm::Permute or Algorithm::FastPermute), or for any cryptographic applications.


    _($_=" "x(1<<5)."?\n".qキ/)Oo.  Gー\        /
                                  /\_ッ/(q    /
    ----------------------------  \__(m.====キ.(_("always off the crowd"))."キ
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Your post is good and do not contradicts mine,
      I suspect you intended to reply to OP, isn't it?
        It was meant as an addendum.

        (If I posted to OP, it would have been out of sequence if anybody else would have posted in between. I wanted to elaborate deliberately on perl not guaranteeing neither order nor disorder, as follow-up to your statement you must not pretend ...)


        _($_=" "x(1<<5)."?\n".qテつキ/)Oo.  Gテつー\        /
                                      /\_テつッ/(q    /
        ----------------------------  \__(m.====テつキ.(_("always off the crowd"))."テつキ
        ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re^2: Hashes aren't being differently randomized
by tye (Sage) on Jun 27, 2006 at 06:39 UTC

    And yet, it seems very unlikely that effectively changing the hash function would result in identical output order. The output order is based on the hashed key values modulo the number of buckets, so changing the hash function is very likely to change the output order.

    So test the effectiveness of the randomization directly: See which hash keys collide. If the same hash keys collide between runs, then the randomization isn't doing what it is supposed to:

    my %hash; my $key= 'aa'; my $count= @ARGV ? shift(@ARGV) : 20; $hash{$key}= 'base'; while( 1 ) { $hash{++$key}= 'test'; if( %hash =~ /^1/ ) { print "Collision: $key\n"; exit if --$count <= 0; } delete $hash{$key}; }

    I always get the same output, but this PERL_HASH_SEED_DEBUG environment variable doesn't work for me so I won't jump to any conclusions (my 5.008_007 is new enough that it should include it, however).

    - tye        

      I always get the same output

      Same here:

      $ PERL_HASH_SEED_DEBUG=1 perl ./foo | md5sum HASH_SEED = 4155558892 da7b0b6c9adb2fbc01be283dc16aed19 - $ PERL_HASH_SEED_DEBUG=1 perl ./foo | md5sum HASH_SEED = 51798468 da7b0b6c9adb2fbc01be283dc16aed19 - $ PERL_HASH_SEED_DEBUG=1 perl ./foo | md5sum HASH_SEED = 2930542944 da7b0b6c9adb2fbc01be283dc16aed19 -

      David Serrano

        Ever wonder exactly which keys in your particular hash are sharing a bucket? Here is a destructive tool that will tell you that:

        my %hash= 0..299; for my $av ( getKeyCollisions( \%hash ) ) { print "@$av\n"; } sub getKeyCollisions { my( $hv )= @_; no warnings 'numeric'; my $buckets= 0 + %$hv; my @keys= keys %$hv; my( @clash, @return ); while( @keys ) { my $key= shift @keys; delete $hv->{$key}; my $b= 0 + %$hv; if( $b == $buckets ) { push @clash, $key; } elsif( @clash ) { push @return, [@clash,$key]; @clash= (); } $buckets= $b; } return @return; } __END__ 276 90 206 118 272 190 298 194 226 58 284 86 76 82 110 280 228 268 218 202 168 184 14 178 24 104 124 256 216 26 80 72 162 74 240 246 108 230 64 210 68 188 116 88 136 30 144 286 120 28 40 134 156 192 250 22 42 158 94 248 296 34 132 164

        - tye        

      "perl -V" reports " ... ccflags =' ... -DNO_HASH_SEED ... ", which likely explains why /I/ always get the same results. My Perl was built by Randy Kobes. I'm a bit curious why that setting was chosen.

      - tye        

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://557624]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (7)
As of 2023-02-02 14:53 GMT
Find Nodes?
    Voting Booth?
    I prefer not to run the latest version of Perl because:

    Results (19 votes). Check out past polls.