in reply to Bitwise & stopped working?

If you have a 32-bit perl, it can very well be that $hash * 65536 results into a value that cannot be expressed in 32 bits. Perl will upgrade the variable to a floating point number; $hash * 65536 can then still be expressed without loss of precision. However, '&' is a bitwise operator; if one of its operands is a floating point number, unexpected things may happen.

You may want to run with a perl that is build to work with 64 bit integers. Or perhaps a use integer; does the trick for you.

Replies are listed 'Best First'.
Re^2: Bitwise & stopped working?
by kennethk (Abbot) on May 12, 2010 at 16:50 UTC
    A simpler fix is just to swap to bit-shifts instead of multiplying by powers of two (Shift Operators):

    $hash = ($c + ($hash << 6) + ($hash << 16) - $hash) & 0xffffffff;

    $hash = ($c + (($hash & 0x03ffffff) << 6) + (($hash & 0x0000ffff) << 16) - $hash) & 0xffffffff;

    This has the advantage of making behavior independent of machine word length (assuming at least 32-bit, of course).

    Update: Modified as per ikegami's comment below. For some reason I thought Perl handled bit-shift overflow in a standardized fashion by dropping bits.

    Except the above is subject to arithmetic overflow once the addition stage is reached, with maximum error at $c=0xFFFFFFFF;$hash=0x0FFFFFFF;. Short of implementing a Full Adder, I think bit operations are not really reasonable. I give you the portable, overflow-immune hash function:

    sub ModifiedMakeHashString($) { my $string = shift; my $hash = 0; for my $c (map {ord} split //,$string) { my $six_shift = ($hash & 0x03ffffff) << 6; my $sixteen_shift = ($hash & 0x0000ffff) << 16; my $inverse = ~($hash-1) & 0xffffffff; $hash = floored_addition( floored_addition($c, $six_shift), floored_addition($sixteen_shift, $inverse)); #printf "c = %09x, 6s = %09x, 16s = %09x, inv = %09x, hash = % +09x\n", # $c, $six_shift, $sixteen_shift, $inverse, $hash; #printf "c = %d, hash = %d\n", $c, $hash; } return $hash; } sub floored_addition { my ($x,$y) = @_; for my $i (reverse (0 .. 31)) { my $check_mask = 1 << $i; if ($check_mask & $x & $y) { my $op_mask = ~(2**$i-1) & 0xffffffff; $x ^= $op_mask&$x; $y ^= $op_mask&$y; last; } last if $check_mask & ~$x & ~$y; } return $x + $y; }

    It runs about 10x slower than the original. Any obvious algorithmic improvements, other than the obvious?

      $c=00000000, $hash=03FFFFFF => $hash=FFFFFFFF 32-bit Perl $c=00000000, $hash=03FFFFFF => $hash=FBFEFFC1 64-bit Perl

      That doesn't fix the problem, even if you fix the two bugs you introduced. Quote perlop,

      The result of overflowing the range of the integers is undefined because it is undefined also in C.

      ($hash << 6) should be (($hash & 0x03ffffff) << 6)
      ($hash << 16) should be (($hash & 0x0000ffff) << 16)

      Update: My original counter example was bad. Fixed.