Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister

comment on

( [id://3333] : superdoc . print w/replies, xml ) Need Help??

The short answer is that you can't. The long answer deals with why you can't:

A 'hash function' reduces any input to a number within a given range. In most cases, this involves some kind of bit-twiddling:

$HASH_LIMIT = 1024; ## myhash (key:string) : int # # calculate a checksum # sub myhash { my $key = shift; my $h = 0; for $c (unpack ('c*', $key)) { ## break the key into chars $h *= 2; ## shift the hash value left one bit $h ^= $c; ## and XOR it with the next char } $h %= $HASH_LIMIT; ## trim to fit the defined range return ($h); ## and return the result } for $k (qw( testing the hash function )) { printf ("%10s - %s\n", $k, &myhash($k)); }

A 'hash' is a data structure that stores values in an array, using the value returned by a hash function as an array index:

@MY_HASH = (0) x $HASH_LIMIT; ## put (key:string, val:string) : nil # # store a value in a hash # sub put { my $key = shift; my $val = shift; $MY_HASH[ &myhash($key) ] = $val; return; } ## get (key:string) : string # # get a value from the hash # sub get { my $key = shift; return ($MY_HASH[ &myhash($key) ]); }

and this code is admittedly trivial because it doesn't check for collisions.. cases where two keys return the same hash value. Collisions are inevitable when you deal with hash functions, because hash functions map an infinite domain of input strings to a finite range of output values. It's just hard (very hard) to calculate which strings will map to a given hash value. Most of the interesting work involving hashes revolves around what to do with collisions, but there are too many alternatives to discuss here.

Hashes are great when you want to work with non-numeric information. Calculating the index value directly from the key is (usually) faster than searching for a specific string in a sorted array. It's also (usually) easier to store a new value in a hash than to shoehorn one into a sorted array, though there are data structures (heaps) which do just that.

The down side is that you can't assume your hash function will preserve attributes like alphabetical order. It's just a tradeoff you just have to accept if you're going to store information in a hash.

In reply to Re: More Sorted Business with Hashes by mstone
in thread More Sorted Business with Hashes by Spenser

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.