in reply to Re^2: Digest (checksum) Algorithm for 12 Decimal Digits?
in thread Digest (checksum) Algorithm for 12 Decimal Digits?

does that mean I've ... preserved the algorithm's bucket-distribution-equality?

Yes - so long as you're ignoring the same bit positions every time.

Let's assume you need to keep 4 digits of an 8-digit number, and you decide to keep the first 4 digits. That would be ok, so long as every number is treated as a full 8-digit value: "123456" is really "00123456", so it must be saved as "0012" and not "1234". To avoid that sort of trap, it's probably best to save the *trailing* digits, rather than the leading digits. Each 4-digit "abcd" will then represent 10000 different 8-digit numbers (10^8 possible inputs spread over 10^4 buckets) - and the distribution is unaffected.
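Here's a rough sketch of the two safe ways to do the truncation, in Python purely for illustration. The function names and the default widths are my own assumptions, not anything from the thread. The last function uses a scaled-down case (2 of 4 digits) so the bucket counts can be checked quickly.

```python
from collections import Counter

def keep_trailing(n: int, keep: int = 4) -> str:
    """Keep the last `keep` decimal digits (hypothetical helper)."""
    return f"{n % 10**keep:0{keep}d}"

def keep_leading_padded(n: int, width: int = 8, keep: int = 4) -> str:
    """Keep the first `keep` digits AFTER zero-padding to `width` digits."""
    return f"{n:0{width}d}"[:keep]

def trailing_bucket_sizes(width: int = 4, keep: int = 2) -> set:
    """Distinct bucket sizes when every `width`-digit number is truncated."""
    counts = Counter(keep_trailing(n, keep) for n in range(10**width))
    return set(counts.values())

print(keep_trailing(123456))        # 3456
print(keep_leading_padded(123456))  # 0012
print(trailing_bucket_sizes())      # {100}: every bucket is the same size
```

Taking the remainder modulo 10**keep never needs to know how wide the original number was, which is why the trailing digits are the harder ones to get wrong.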

Note that if, in the above example, "123456" was being saved as "1234", then "1234567" would also be saved as "1234". They've been reduced to the same number, though they ought to have been reduced to different numbers ("12" and "123" respectively, i.e. 0012 and 0123). I think that sort of treatment would introduce bias.
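To make the bias concrete, here's a small sketch (again Python, with helper names I've made up) comparing the unpadded prefix against the zero-padded one. The counting part is scaled down to 2 digits kept out of 4 so it runs instantly.

```python
from collections import Counter

def naive_prefix(n: int, keep: int = 4) -> str:
    """The trap: take leading digits of the number as written, no padding."""
    return str(n)[:keep]

def padded_prefix(n: int, width: int = 8, keep: int = 4) -> str:
    """Zero-pad to the full width first, then take the leading digits."""
    return str(n).zfill(width)[:keep]

# The collision described above
print(naive_prefix(123456), naive_prefix(1234567))    # 1234 1234
print(padded_prefix(123456), padded_prefix(1234567))  # 0012 0123

# Scaled-down distribution check over all numbers below 10**4, keeping 2 digits
naive = Counter(naive_prefix(n, 2) for n in range(10**4))
padded = Counter(padded_prefix(n, 4, 2) for n in range(10**4))
print(naive["12"], naive.get("05", 0))  # 111 0
print(padded["12"], padded["05"])       # 100 100
```

With the naive version, prefixes starting with "0" are never produced at the full length, and the others absorb the shorter numbers as well, so the buckets are no longer equal.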

Cheers,
Rob