Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

A web application that I'm working on provides a "private" record type that does not show up unless a URL specific to that record is entered. I would like to provide an interface to be able to access these "private" records by entering a key. The custom URLs just take advantage of the primary key on the table as security is not a concern. However, I would like to use something more obscure for the other entry. I would use Digest::MD5, but I want a hash that is only 5 or 6 characters long. Cryptographic security isn't an issue here; I just don't want people to see that they can just try different, small integers in order to guess at records.
That looks like a lot for what seems like a simple question: is there a hash that will produce 5 or 6 character output?

Replies are listed 'Best First'.
Re: Short hash algorithm
by traveler (Parson) on Nov 18, 2004 at 00:08 UTC
    There are hashes that will hash into any number of "buckets". Since you do not care about security or "accuracy", what I have done in the past is just to use the first or last characters of the MD5 hash. It has always been random enough for me. There is a danger of collision, but that is true of a smaller hash function, too.

    It looks as though you need about 32 bits of hash (5 or 6 bits/char depending on char set and 5 or 6 chars). How about Digest::Adler32 or CRC32?

    --traveler

Re: Short hash algorithm
by kvale (Monsignor) on Nov 18, 2004 at 00:18 UTC
    If you are not worried about security or collision frequency, you could use a mod:
    my $short = $digest % ($largest_int_wanted + 1);
    Alternately, you could use the 5-6 least significant characters of the digest.

    -Mark

      Taking the modulus doesn't prevent people from guessing the next and/or previous key, as it will likely still be n+1 or n-1, if the modulus is large enough. But based on the modulus idea, one could use the linear method used by random number generators - take a large prime number, multiply the record number by it, and then take the modulus - this should give a fairly good distribution, but I'd still look at what Knuth (or some other book) says about the algorithm. I think it's called Linear Random Number Generator, but it might be called something else.

Re: Short hash algorithm
by bart (Canon) on Nov 18, 2004 at 19:43 UTC
    Use bitreverse (see Bit::Vector, Reverse() for example), or another similar bitswapping mechanism, that way you can map any integer to another integer, in a unique way, reversible way. Most users won't have a clue to what's happening.

    Another idea is to just pick a random number, check it for uniqueness, and store it in a database table for reverse lookup.

Re: Short hash algorithm
by Random_Walk (Prior) on Nov 18, 2004 at 12:06 UTC

    You could use your index as a seed then call rand with some appropriate value

    perl -e 'for (1..10) {srand($_); print int rand (100000), $/}' # output 51385 2770 54159 5545 56930 8319 59704 11090 62475 13864

    Cheers,
    R.

Re: Short hash algorithm
by TedPride (Priest) on Nov 18, 2004 at 09:21 UTC
    Use MD5 and take the last x number of characters rather than the whole thing.

    EDIT: As BrowserUk pointed out in a PM to me, this was already said in the first post to the thread. My bad.