Re^2: how are the number of buckets in a perl hash chosen

If i did understand exactly, from your example, that the power of 2 should be a prime.
I tried it on different examples and saw that this is not correct every time.

perl  -E "%c = (3=>5);keys(%c) = 33;say  scalar(%c)"
1/64
[download]

Maybe i did not understand it 100%.
I looked at perldata and keys and saw indeed that the number of buckets is a power of 2 but it was not explained why.

Comment on Re^2: how are the number of buckets in a perl hash chosen Download Code

Replies are listed 'Best First'.
Re^3: how are the number of buckets in a perl hash chosen by Anonymous Monk on Jun 09, 2014 at 15:17 UTC
If i did understand exactly, from your example, that the power of 2 should be a prime. I don't know, wolfram says "prime factorization"... I looked at perldata and keys and saw indeed that the number of buckets is a power of 2 but it was not explained why. Oh, why? because :) its coded that way :D `newsize &= ~(newsize & (1 + ~newsize)); /* get proper power of 2 / .. Newxz(a, PERL_HV_ARRAY_ALLOC_BYTES(newsize), char);` [download] Programmers just love to do things in powers of two perl hash presize why power of 2 How Hashes Really Work - Perl.com In principle, a hashing function returns an array index directly; in practice, it is common to use its (arbitrary) return value modulo the number of buckets as the actual index. (Using a prime number of buckets that is not too close to a power of two tends to produce a sufficiently uniform key distribution.) A Hash Function for Hash Table Lookup The best hash table sizes are powers of 2. There is no need to do mod a prime (mod is sooo slow!). Hash table In the case that the array size is a power of two, the remainder operation is reduced to masking, which improves speed*, but can increase problems with a poor hash function. So there you have it, power of two, its for the speed	[reply] [d/l]

Replies are listed 'Best First'.

Re^3: how are the number of buckets in a perl hash chosen
by Anonymous Monk on Jun 09, 2014 at 15:17 UTC

If i did understand exactly, from your example, that the power of 2 should be a prime.

I don't know, wolfram says "prime factorization"...

I looked at perldata and keys and saw indeed that the number of buckets is a power of 2 but it was not explained why.

Oh, why? because :) its coded that way :D

newsize &= ~(newsize & (1 + ~newsize));    /* get proper power of 2 */
..
Newxz(a, PERL_HV_ARRAY_ALLOC_BYTES(newsize), char);
[download]

Programmers just love to do things in powers of two

perl hash presize why power of 2 How Hashes Really Work - Perl.com

In principle, a hashing function returns an array index directly; in practice, it is common to use its (arbitrary) return value modulo the number of buckets as the actual index. (Using a prime number of buckets that is not too close to a power of two tends to produce a sufficiently uniform key distribution.)

A Hash Function for Hash Table Lookup

The best hash table sizes are powers of 2. There is no need to do mod a prime (mod is sooo slow!).

Hash table

In the case that the array size is a power of two, the remainder operation is reduced to masking, which improves speed, but can increase problems with a poor hash function.

So there you have it, power of two, its for the speed

[reply]
[d/l]