static long
string_hash(PyStringObject *a)
{
    register Py_ssize_t len;
    register unsigned char *p;
    register long x;

    if (a->ob_shash != -1)
        return a->ob_shash;
    len = Py_SIZE(a);
    p = (unsigned char *) a->ob_sval;
    x = *p << 7;
    while (--len >= 0)
        x = (1000003*x) ^ *p++;
    x ^= Py_SIZE(a);
    if (x == -1)
        x = -2;
    a->ob_shash = x;
    return x;
}
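To see concretely how the width of long changes the result, here is a rough Python port of the routine above (my own sketch, not CPython code): the bits parameter selects the width of the simulated C long, and the final step reinterprets the value as signed, the way C would.

```python
def string_hash(s, bits=64):
    """Sketch of CPython 2's string_hash for a byte string `s`,
    simulating a C `long` that is `bits` wide (32 or 64)."""
    mask = (1 << bits) - 1
    if len(s) == 0:
        # In the C code, *p is the NUL terminator, so x stays 0.
        return 0
    x = (s[0] << 7) & mask
    for c in s:
        x = ((1000003 * x) ^ c) & mask   # overflow wraps as in C
    x ^= len(s)
    x &= mask
    if x >= 1 << (bits - 1):             # reinterpret as signed
        x -= 1 << bits
    if x == -1:                          # -1 signals an error in CPython
        x = -2
    return x

# The same string hashes differently under a 32-bit long (LLP64)
# and a 64-bit long (LP64):
print(string_hash(b"a", 32))  # -468864544
print(string_hash(b"a", 64))  # 12416037344
```

With `bits=32` the multiplications overflow much sooner, which is exactly why a hash computed on one build need not match the other.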
we can see that what matters is not whether the platform itself is 64-bit,
but whether the long type
used by the C compiler that Python was built with is 64-bit.
For a Python built with a 32-bit long my solution should work; for a 64-bit long it will not.
On 64-bit architectures, Windows C compilers tend to use the LLP64 programming model (32-bit long),
while most others tend to use the LP64 model (64-bit long).
From this Stack Overflow question:
The true "war" was for sizeof(long), where Microsoft decided
for sizeof(long) == 4 (LLP64) while nearly everyone else decided for sizeof(long) == 8 (LP64).
Note that a programming model is a choice made on a per-compiler basis,
and several can coexist on the same OS. However, the programming model
chosen as the primary model for the OS API typically dominates.
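A quick way to check which model a given interpreter was built under is to ask ctypes for the size of the C long (a diagnostic only, nothing to do with the hash code itself):

```python
import ctypes

# 4 on LLP64 builds (e.g. 64-bit Windows), 8 on LP64 builds
# (most 64-bit Unix-like systems).
print(ctypes.sizeof(ctypes.c_long))
```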
Hmmm, I see from this later version of stringobject.c
that _Py_HashSecret_* has been added, presumably to protect against DoS attacks that exploit hash collisions in Python dictionaries.
static long
string_hash(PyStringObject *a)
{
    register Py_ssize_t len;
    register unsigned char *p;
    register long x;

#ifdef Py_DEBUG
    assert(_Py_HashSecret_Initialized);
#endif
    if (a->ob_shash != -1)
        return a->ob_shash;
    len = Py_SIZE(a);
    /*
      We make the hash of the empty string be 0, rather than using
      (prefix ^ suffix), since this slightly obfuscates the hash secret
    */
    if (len == 0) {
        a->ob_shash = 0;
        return 0;
    }
    p = (unsigned char *) a->ob_sval;
    x = _Py_HashSecret.prefix;
    x ^= *p << 7;
    while (--len >= 0)
        x = (1000003*x) ^ *p++;
    x ^= Py_SIZE(a);
    x ^= _Py_HashSecret.suffix;
    if (x == -1)
        x = -2;
    a->ob_shash = x;
    return x;
}
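A rough Python approximation of this randomized variant (again my own sketch, not CPython code; prefix and suffix stand in for the _Py_HashSecret fields that CPython draws at interpreter startup):

```python
def randomized_string_hash(s, prefix, suffix, bits=64):
    """Sketch of the randomized string_hash: `prefix` and `suffix`
    play the role of _Py_HashSecret.prefix/.suffix."""
    mask = (1 << bits) - 1
    if len(s) == 0:
        return 0                       # fixed, regardless of the secret
    x = prefix & mask
    x ^= (s[0] << 7) & mask
    for c in s:
        x = ((1000003 * x) ^ c) & mask
    x ^= len(s)
    x ^= suffix & mask
    x &= mask
    if x >= 1 << (bits - 1):           # reinterpret as signed
        x -= 1 << bits
    if x == -1:
        x = -2
    return x

# With zero secrets this degenerates to the original algorithm;
# nonzero secrets shift every nonempty string's hash.
print(randomized_string_hash(b"a", 0, 0))  # 12416037344
```

Note that the empty string still always hashes to 0, which is exactly the special case the quoted comment is explaining: returning prefix ^ suffix there would leak information about the secret.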
See also: