Krokus has asked for the wisdom of the Perl Monks concerning the following question:

This is driving me mad. I have a perl script that has been working for several months but the other day it broke (the script hasn't been modified since I first wrote it). At first I thought, "well, it must be a result of something I did elsewhere that's mucking up the input to the script", but when I looked closer I realized that a bitwise & was now failing to work properly.

In the script I have a function that hashes a string to a 32-bit value. It looks like this:

sub MakeHashString($) { my $string = shift; my $hash = 0; my $i = 0; my $c = 0; for ($i = 0; $i < length($string); ++$i) { $c = ord(substr($string, $i, 1)); $hash = ($c + ($hash * 64) + ($hash * 65536) - $hash) & 0xfffff +fff; } return $hash; }

I added prints to the function to isolate the problem. When this function gets called with, for example, the string "accept", here is what $c and $hash are at the end of each loop iteration, showing $hash before and after the "& 0xffffffff":

c = 97, hash = 97 masked hash = 97 c = 99, hash = 6363202 masked hash = 6363202 c = 99, hash = 417419688097 masked hash = 4294967295 c = 101, hash = 281745559584806 masked hash = 4294967295 c = 112, hash = 281745559584817 masked hash = 4294967295 c = 116, hash = 281745559584821 masked hash = 4294967295

After the third iteration, $hash exceeds 0xffffffff and is forevermore relegated to being masked to a result of 0xffffffff. It doesn't matter if I mask it with the base 10 literal instead (i.e., 4294967295).

As I said, this script has worked fine for ages. My only thought was that perhaps my system path got messed up and a different version of perl is running. The version of perl I'm running which produces this issue is:

This is perl, v5.8.8 built for MSWin32-x86-multi-thread (with 50 registered patches, see perl -V for more detail) Copyright 1987-2006, Larry Wall

Anyone have an idea as to why bitwise & would fail like this?

Update: The perl version is ActiveState's build.

Replies are listed 'Best First'.
Re: Bitwise & stopped working?
by ikegami (Patriarch) on May 12, 2010 at 16:51 UTC

    Bitwise And only works on integers stored as integers (or on strings), not on floats. You're generating a number that doesn't fit in an integer, so Perl switches to using floats.

    On a 32-bit Perl, doing & 0xffffffff will do anything at best, produce garbage at worse.

    If you want to rely on the fact that Perl actually gives you at least 52 bits of precision for unsigned numbers through the use of floats, you could use division as an alternative.

    $hash = ($c + ($hash * 64) + ($hash * 65536) - $hash); $hash -= int($hash/2**32)*2**32;

    I haven't checked to see if this is safe for every unsigned 32-bit $c and every unsigned 32-bit $hash.
    It's safe for every unsigned 32-bit $c and every unsigned 32-bit $hash.

    Update: The above is a silly way of writing

    $hash = ($c + ($hash * 64) + ($hash * 65536) - $hash) % 2**32;

    I thought Perl didn't have a mod operator.

Re: Bitwise & stopped working?
by moritz (Cardinal) on May 12, 2010 at 16:51 UTC
    This is the output on my machine (with a 64bit Perl and OS):
    c = 97; hash = 61, masked hash = 61 c = 99; hash = 611842, masked hash = 611842 c = 99; hash = 613026f8a1, masked hash = 3026f8a1 c = 101; hash = 3032d2383004, masked hash = d2383004 c = 112; hash = d26bebd7d16c, masked hash = ebd7d16c c = 116; hash = ec11db888a08, masked hash = db888a08

    (where the hashes are in hexadecimal, and c in decimal).

    I'm not sure if that's the right explanation, but maybe the multiplication of a large number with 2**16 causes an automatic upgraded to floating point numbers on your (32bit) platform.

    You can verify this assumption by using Devel::Peek, and Dump the variable $hash after each iteration. It prints (PADMY,IOK,pIOK), and if my guess is right, it will print (PADMY,NOK,pNOK) on your platform.

    You could try to avoid that by using strings, and bit shifts instead of multiplications. But I'm sure other monks have more experience with bit manipulation magic than I have. Another (but probably inefficient) solution is to use bigint.

    Perl 6 - links to (nearly) everything that is Perl 6.

      Thanks muchly for the suggestions. It seems that executing the script whilst running a debug build via Visual Studio caused ActiveState's Perl to run rather than the version I was supposed to be using, indicating a path issue. This explains the sudden failure of bitwise AND.

      The notion that $hash is being converted to float appears to stand up; by changing the AND to "% 4294927296", the script works properly in both versions of Perl and all is well in my world again. :D

      Update: Oh, and to further show that it was a float issue, the ActiveState Perl was 32-bit, the version I normally use is indeed a 64-bit build.

        It seems that executing the script whilst running a debug build via Visual Studio caused ActiveState's Perl to run rather than the version I was supposed to be using, indicating a path issue.
        In the future, when something "suddenly stops working" and "I haven't changed anything", look to see if you actually have changed something (like the processing environment).

      it will print (PADMY,NOK,pNOK) on your platform.

      Who's pThere?

Re: Bitwise & stopped working?
by JavaFan (Canon) on May 12, 2010 at 16:41 UTC
    If you have a 32-bit perl, it can very well be that $hash * 65536 results into a value that cannot be expressed in 32 bits. Perl will upgrade the variable to a floating point number; $hash * 65536 can then still be expressed without loss of precision. However, '&' is a bitwise operator; if one of its operands is a floating point number, unexpected things may happen.

    You may want to run with a perl that is build to work with 64 bit integers. Or perhaps a use integer; does the trick for you.

      A simpler fix is just to swap to bit-shifts instead of multiplying by powers of two (Shift Operators):

      $hash = ($c + ($hash << 6) + ($hash << 16) - $hash) & 0xffffffff;

      $hash = ($c + (($hash & 0x03ffffff) << 6) + (($hash & 0x0000ffff) << 16) - $hash) & 0xffffffff;

      This has the advantage of making behavior independent of machine word length (assuming at least 32-bit, of course).

      Update: Modified as per ikegami's comment below. For some reason I thought Perl handled bit-shift overflow in a standardized fashion by dropping bits.

      Except the above is subject to arithmetic overflow once the addition stage is reached, with maximum error at $c=0xFFFFFFFF;$hash=0x0FFFFFFF;. Short of implementing a Full Adder, I think bit operations are not really reasonable. I give you the portable, overflow-immune hash function:

      sub ModifiedMakeHashString($) { my $string = shift; my $hash = 0; for my $c (map {ord} split //,$string) { my $six_shift = ($hash & 0x03ffffff) << 6; my $sixteen_shift = ($hash & 0x0000ffff) << 16; my $inverse = ~($hash-1) & 0xffffffff; $hash = floored_addition( floored_addition($c, $six_shift), floored_addition($sixteen_shift, $inverse)); #printf "c = %09x, 6s = %09x, 16s = %09x, inv = %09x, hash = % +09x\n", # $c, $six_shift, $sixteen_shift, $inverse, $hash; #printf "c = %d, hash = %d\n", $c, $hash; } return $hash; } sub floored_addition { my ($x,$y) = @_; for my $i (reverse (0 .. 31)) { my $check_mask = 1 << $i; if ($check_mask & $x & $y) { my $op_mask = ~(2**$i-1) & 0xffffffff; $x ^= $op_mask&$x; $y ^= $op_mask&$y; last; } last if $check_mask & ~$x & ~$y; } return $x + $y; }

      It runs about 10x slower than the original. Any obvious algorithmic improvements, other than the obvious?

        $c=00000000, $hash=03FFFFFF => $hash=FFFFFFFF 32-bit Perl $c=00000000, $hash=03FFFFFF => $hash=FBFEFFC1 64-bit Perl

        That doesn't fix the problem, even if you fix the two bugs you introduced. Quote perlop,

        The result of overflowing the range of the integers is undefined because it is undefined also in C.

        ($hash << 6) should be (($hash & 0x03ffffff) << 6)
        ($hash << 16) should be (($hash & 0x0000ffff) << 16)

        Update: My original counter example was bad. Fixed.

Re: Bitwise & stopped working?
by ambrus (Abbot) on May 14, 2010 at 10:35 UTC

    Incidentally, if you wanted to use arithmetic on 64-bit integers on the 32-bit perl, you could use the modules Math::Int64 or Math::BigInt. Eg.

    use Math::BigInt "lib", "GMP"; $x = Math::BigInt->new("123456789012345 +6789"); $y = Math::BigInt->new("1466709021345981"); warn ($x & $y); # OR use Math::Int64 "uint64"; $x = uint64("1234567890123456789"); $y = uin +t64("1466709021345981"); warn ($x & $y);

    These modules let you create objects that behave just like normal integers, but with more bits available.