http://qs1969.pair.com?node_id=557616

Hue-Bond has asked for the wisdom of the Perl Monks concerning the following question:

```\$ perldoc -f each
each HASH
Entries are returned in an apparently random order. [...] Since
Perl 5.8.1 the ordering is different even between different
runs of Perl for security reasons (see "Algorithmic Complexity
Attacks" in perlsec).

Great feature, so let's try it!

```\$ PERL_HASH_SEED_DEBUG=1 perl -e 'my %a=0..299; print \$_ for each %a'
+| md5sum
HASH_SEED = 3379561142
b6d67a24906e8a8541291882f81d31ca  -
\$ PERL_HASH_SEED_DEBUG=1 perl -e 'my %a=0..299; print \$_ for each %a'
+| md5sum
HASH_SEED = 4068799219
b6d67a24906e8a8541291882f81d31ca  -

Hmm, same MD5? And what about keys and values?

```\$ PERL_HASH_SEED_DEBUG=1 perl -e 'my %a=0..299; print \$_ for keys %a'
+| md5sum
HASH_SEED = 1151419008
dd56233cf84603df0d47d272da1af003  -
\$ PERL_HASH_SEED_DEBUG=1 perl -e 'my %a=0..299; print \$_ for keys %a'
+| md5sum
HASH_SEED = 2731377116
dd56233cf84603df0d47d272da1af003  -

\$ PERL_HASH_SEED_DEBUG=1 perl -e 'my %a=0..299; print \$_ for values %a
+' | md5sum
HASH_SEED = 1861095523
c1a75ab8e3bf1ff6c07b01025a1219e9  -
\$ PERL_HASH_SEED_DEBUG=1 perl -e 'my %a=0..299; print \$_ for values %a
+' | md5sum
HASH_SEED = 788024661
c1a75ab8e3bf1ff6c07b01025a1219e9  -

This implies I'm always getting the same output, so each, keys and values are returning the elements in the same order even when the seeds are different! My Perl was compiled without -DUSE_HASH_SEED_EXPLICIT (according to \$Config::Config{ccflags}) so I shouldn't be setting the seed manually but anyway I tried, just in case. I used bash's \$RANDOM variable, that returns a different value each time it's evaluated:

```\$ echo \$RANDOM
8035
\$ echo \$RANDOM
797
\$ PERL_HASH_SEED=\$RANDOM PERL_HASH_SEED_DEBUG=1 perl -e 'my %a=0..299;
+ print \$_ for values %a' | md5sum
HASH_SEED = 26815
c1a75ab8e3bf1ff6c07b01025a1219e9  -
\$ PERL_HASH_SEED=\$RANDOM PERL_HASH_SEED_DEBUG=1 perl -e 'my %a=0..299;
+ print \$_ for values %a' | md5sum
HASH_SEED = 30449
c1a75ab8e3bf1ff6c07b01025a1219e9  -

Super Search found this but it doesn't seem to apply here since perl -V | grep SEED shows nothing. The test here also returns the same output every time I run it. This is a 5.8.8 running on Debian/Linux. Ideas?

--
David Serrano

Replies are listed 'Best First'.
Re: Hashes aren't being differently randomized
by vkon (Curate) on Jun 26, 2006 at 17:58 UTC
In my opinion, hash randomization issue do not mean that you're guaranteed different keys order on different runs (but sometimes you will), it means something different.

algorithm complexity attack only means you'll not get hash slowness because all hash items felt into same bucket so extracting hash element will be O(N), instead of O(log(N))

However the moral is - you must not pretend that keys order is same between runs (so some programs probably stopped working when hash randomization was introduced)

This is explicitly stated in perlsec:
Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order.

Also note that while the order of the hash elements might be randomised, this "pseudoordering" should not be used for applications like shuffling a list randomly (use List::Util::shuffle() for that, see List::Util, a standard core module since Perl 5.8.0; or the CPAN module Algorithm::Numerical::Shuffle), or for generating permutations (use e.g. the CPAN modules Algorithm::Permute or Algorithm::FastPermute), or for any cryptographic applications.

greets,
--shmem

```_(\$_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
/\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.(\$e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;\$e.e && print}```
I suspect you intended to reply to OP, isn't it?

And yet, it seems very unlikely that effectively changing the hash function would result in identical output order. The output order is based on the hashed key values modulo the number of buckets, so changing the hash function is very likely to change the output order.

So test the effectiveness of the randomization directly: See which hash keys collide. If the same hash keys collide between runs, then the randomization isn't doing what it is supposed to:

```my %hash;
my \$key= 'aa';
my \$count= @ARGV ? shift(@ARGV) : 20;
\$hash{\$key}= 'base';
while( 1 ) {
\$hash{++\$key}= 'test';
if( %hash =~ /^1/ ) {
print "Collision: \$key\n";
exit   if  --\$count <= 0;
}
delete \$hash{\$key};
}

I always get the same output, but this PERL_HASH_SEED_DEBUG environment variable doesn't work for me so I won't jump to any conclusions (my 5.008_007 is new enough that it should include it, however).

- tye

I always get the same output

Same here:

```\$ PERL_HASH_SEED_DEBUG=1 perl ./foo | md5sum
HASH_SEED = 4155558892
\$ PERL_HASH_SEED_DEBUG=1 perl ./foo | md5sum
HASH_SEED = 51798468
\$ PERL_HASH_SEED_DEBUG=1 perl ./foo | md5sum
HASH_SEED = 2930542944

--
David Serrano

"perl -V" reports " ... ccflags =' ... -DNO_HASH_SEED ... ", which likely explains why /I/ always get the same results. My Perl was built by Randy Kobes. I'm a bit curious why that setting was chosen.

- tye

Re: Hashes aren't being differently randomized (explanation)
by demerphq (Chancellor) on Jun 28, 2006 at 07:58 UTC

Reviewing the docs and the source I think I can explain what the deal is. Hashes arent "randomized" at all. In fact from what I can tell all perls default to using the exact same hashing at the beginning. But when a hash goes pathological (which appears from the hash testing code to be defined as having twenty keys in a single bucket) then randomization kicks in, on a per hash basis.

I dont know if this behaviour makes sense, or if it matches with how its documented, but that appears to be the way it works.

The code below shows how you can put 10 keys in a single bucket but that twenty will cause the hash to randomize itself. I had to change tyes routine a little as the strings we are dealing with are all zero bytes (which effectively guarantees that they go in the same bucket in a non randomized hash).

I guess you could use this technique to randomize a hash. Stuff it with "\0" x 1 to "\0" x 20 and then empty it and then refill it with what you really want to store in it.

```my %hash= map { "\0" x \$_ => \$_ }1..10;
print "Ten keys:\n";
for my \$av (  getKeyCollisions_Len( \%hash )  ) {
print "@\$av\n";
}
print "Twenty keys:\n";
%hash= map { "\0" x \$_ => \$_ }1..20;
for my \$av (  getKeyCollisions_Len( \%hash )  ) {
print "@\$av\n";
}
sub getKeyCollisions_Len
{
my( \$hv )= @_;
no warnings 'numeric';
my \$buckets= 0 + %\$hv;
my @keys= keys %\$hv;
my( @clash, @return );
while(  @keys  ) {
my \$key= shift @keys;
delete \$hv->{\$key};
my \$b= 0 + %\$hv;
if(  \$b == \$buckets  ) {
push @clash, length \$key;
} elsif(  @clash  ) {
push @return, [@clash,length \$key];
@clash= ();
}
\$buckets= \$b;
}
return @return;
}
__END__
Ten keys:
10 9 8 7 6 5 4 3 2 1
Twenty keys:
17 12
4 5
9 14
---
\$world=~s/war/peace/g

Re: Hashes aren't being differently randomized
by demerphq (Chancellor) on Jun 27, 2006 at 21:56 UTC

I think you need a debugging perl for the environment var to work. But you can get the same effect from Hash::Util::hash_seed(), which returns 0 if hash seed randomization is NOT in effect. From what i can tell Activestates builds disable this feature.

---
\$world=~s/war/peace/g

I think you need a debugging perl for the environment var to work. [...] you can get the same effect from Hash::Util::hash_seed()

Both environment variables shown in the OP seem to be working:

```\$ PERL_HASH_SEED_DEBUG=1 perl -MHash::Util=hash_seed -le'print hash_se
+ed'
HASH_SEED = 165032588
165032588
\$ PERL_HASH_SEED_DEBUG=1 perl -MHash::Util=hash_seed -le'print hash_se
+ed'
HASH_SEED = 725441538
725441538

tye's getKeyCollisions shows always the same output here:

```\$ PERL_HASH_SEED_DEBUG=1 perl gkc.pl | md5sum
HASH_SEED = 501557310
e4d6401f730eede592733482fac0ec61  -
\$ PERL_HASH_SEED_DEBUG=1 perl gkc.pl | md5sum
HASH_SEED = 1324598606
e4d6401f730eede592733482fac0ec61  -