Re: Hash tables, are they really what we see?
by GrandFather (Saint) on Oct 03, 2012 at 04:46 UTC
To a very large extent, just let Perl do its thing. Perl has been optimised to do fast the things that programmers do often. Almost always, a list or tree in other languages is a solution to an intermediate problem which you can solve directly using Perl's hash or dynamic array structures.
A Perl hash is an associative array which stores values that are accessed using keys. Under the hood the key turns into a "hash" (hence the name of the structure) which Perl is very quick at looking up. The whole point of a hash-type data structure is that the lookup is fast (due to the hash magic) and finding that something isn't there is just as fast as finding that it is there. In both cases the time is essentially a small constant. The trick is that the key gets turned by a hashing function into an index into a hash table so (simplifying greatly) the time to find (or not find) an entry is the time the hash function takes plus the time to index into the table: a short and almost constant time. Hashes are generally the answer to problems you might solve using trees in other languages.
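To make the "key becomes an index" step concrete, here is a minimal sketch. The function toy_bucket_index and its multiply-by-33 mixing are made up for illustration; this is not the hashing function Perl uses internally, only the general idea of reducing a key to a slot number.

```perl
use strict;
use warnings;

# Hypothetical toy hashing function (NOT Perl's internal one).
# It folds the characters of the key into a number, then reduces
# that number to a bucket index in the range 0 .. $buckets - 1.
sub toy_bucket_index {
    my ($key, $buckets) = @_;
    my $h = 0;
    $h = ($h * 33 + ord($_)) % 1_000_003 for split //, $key;
    return $h % $buckets;
}

# The cost depends on the length of the key, not on how many
# entries the table already holds.
for my $key (qw(apple pear kiwi)) {
    printf "%-5s -> bucket %d of 8\n", $key, toy_bucket_index($key, 8);
}
```

The same key always yields the same index, which is why a miss is as cheap as a hit: Perl only has to look in the one place the key could be.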
Perl is very time efficient at managing dynamic arrays, which can easily be used in the ways you might want to use linked lists. In particular Perl's arrays are fast for adding and removing blocks of elements at each end and are pretty fast for adding and removing blocks of elements elsewhere in the array. Under the hood Perl does clever stuff with spare space around a contiguous block of element pointers (not linked lists), but you don't need to know that.
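As a small illustration of the array operations mentioned above, the sketch below uses only the built-ins push, pop, shift, unshift and splice. The variable names are arbitrary.

```perl
use strict;
use warnings;

my @queue = (2, 3, 4);

push    @queue, 5, 6;      # add a block at the end   -> 2 3 4 5 6
unshift @queue, 0, 1;      # add a block at the front -> 0 1 2 3 4 5 6
my $last  = pop   @queue;  # remove from the end      -> returns 6
my $first = shift @queue;  # remove from the front    -> returns 0

# splice(ARRAY, OFFSET, LENGTH, LIST) removes LENGTH elements
# starting at OFFSET and inserts LIST in their place.
my @removed = splice @queue, 1, 2, 'a', 'b', 'c';

print "@queue\n";    # prints: 1 a b c 4 5
print "@removed\n";  # prints: 2 3
```

Used this way, one array covers the usual stack, queue and "insert in the middle" cases that a linked list would serve elsewhere.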
True laziness is hard work
Re: Hash tables, are they really what we see?
by dsheroh (Monsignor) on Oct 03, 2012 at 08:26 UTC
"But let's assume we have a hash with, for example, over 1 000 000 entries as a word count, and we then search for a word that just does not appear there. How in blazes will Perl know that the word is not among a million words!? I just picture a barrel filled with over a million red and blue balls, and you have to say you are 100% sure there are no other colors just by looking at that barrel."
Your barrel is the wrong image for visualizing a hash. The magic of hashes is that you don't do hash searches, you do hash lookups. When you ask for $hash{foo}, Perl doesn't have to examine every key in %hash to see whether foo is among them. Instead, it calculates "If a key foo exists, then it will be in this location", then looks only in that location.
For a better real-world image, think of a hotel with mailboxes on the wall behind the front desk. When you want to see whether you have any mail, you don't tell the clerk your name and then wait for them to check every piece of mail to see who it's addressed to. Instead you say "I'm in room 234" and the clerk looks only at the box numbered 234. If there's anything in that box, you have mail; if there isn't, you don't. (In this example, you stating your room number is analogous to the hashing function used by Perl to map hash keys to "buckets" in the hash.)
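In code, "checking the mailbox" is just exists. A minimal sketch with a tiny made-up word count (the sample text and variable names are only for illustration):

```perl
use strict;
use warnings;

my $text = "the quick brown fox jumps over the lazy dog the end";

# Build the word count: one hash entry per distinct word.
my %count;
$count{$_}++ for split ' ', $text;

# exists asks "is this key in the hash?" without scanning the
# other keys and without creating the entry as a side effect.
for my $word (qw(the fox unicorn)) {
    if (exists $count{$word}) {
        print "$word: $count{$word}\n";
    }
    else {
        print "$word: not present\n";
    }
}
```

The lookup for unicorn goes through the same steps as the lookup for the; it simply finds nothing in the place where that key would have been.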
In the abstract, you mean, let's say it pre-reserves cells in memory that are expected to be either full or empty, so when I tell it, "perl, I want a key name1, which I don't remember if I stored", it looks into some 0xFF00BB3012, which it expects to be in the memory for the hash, and tells me: "nope, I don't have what you want at that address, so I won't give you the corresponding value from the corresponding address." Quite abstract stuff, but as far as I got it, perl kind of flags these addresses and then directly accesses them with low-level operations which are too much for me to bear...
Re: Hash tables, are they really what we see?
by remiah (Hermit) on Oct 03, 2012 at 05:19 UTC
Hello.
I ran some benchmarks of a Perl hash vs. in-memory SQLite several days ago. The number of records you are interested in seems similar to my case, so have a look at this thread if you are interested.
Hash lookup is fast, but there are cases where in-memory SQLite is better than a hash.
Re: Hash tables, are they really what we see?
by AnomalousMonk (Archbishop) on Oct 03, 2012 at 07:35 UTC
Re: Hash tables, are they really what we see?
by locked_user sundialsvc4 (Abbot) on Oct 03, 2012 at 13:45 UTC
The analogy that I use for hashes is that of a set of post-office boxes ... which, for the purposes of the analogy, might be shared among a number of different people. Based on what the envelope of the incoming piece of mail says, it will always be placed into only one box. When someone comes to ask for their mail, you’ll look only in that box, but you still might have to look through the contents of that one box to find what you’re looking for.
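The shared-box picture can be sketched as a toy table in which each box holds a short list of [key, value] pairs. Everything here (box_for, deliver, collect, the four-box size, the character-sum hash) is hypothetical and chosen to keep the example small; Perl's real implementation differs in its details.

```perl
use strict;
use warnings;

my $NUM_BOXES = 4;                       # deliberately tiny, so boxes are shared
my @boxes = map { [] } 1 .. $NUM_BOXES;  # each box holds [key, value] pairs

# Toy hash function: sum of character codes, reduced to a box number.
sub box_for {
    my ($key) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $key;
    return $sum % $NUM_BOXES;
}

# Put mail in the one box the key maps to, replacing an older value.
sub deliver {
    my ($key, $value) = @_;
    my $box = $boxes[ box_for($key) ];
    for my $pair (@$box) {
        if ($pair->[0] eq $key) { $pair->[1] = $value; return }
    }
    push @$box, [ $key, $value ];
    return;
}

# Look only in that one box, then search its (short) contents.
sub collect {
    my ($key) = @_;
    for my $pair (@{ $boxes[ box_for($key) ] }) {
        return $pair->[1] if $pair->[0] eq $key;
    }
    return;
}

deliver($_, length($_)) for qw(alice bob carol dave erin);

my $bob = collect('bob');
my $zed = collect('zed');
print "bob -> $bob\n";
print "zed -> ", (defined $zed ? $zed : 'no mail'), "\n";
```

With five keys and four boxes, at least two keys must share a box, and the inner loop in collect is the "look through the contents of that one box" step.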
Some database systems have been built which do provide “hash” indexes. They are fast and efficient, provided that both the data-distribution and the hash function h() are such that there are not an excessive number of collisions ... where too much mail winds up in the same box.
Hash structures are low-maintenance, but they are also random-access-only. You can’t iterate through values naturally in key order. (Although a hash variable that is tied to a Berkeley DB file of the B-tree type will yield its contents in key order to each().) Hashes do not require the rebalancing steps that are required by trees. They are so useful, and therefore so often used, that every major language now has a rock-solid, high-performance implementation of them.
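For an ordinary in-memory hash, the usual workaround for the ordering point is to sort the keys when you need them in order. A minimal sketch with made-up data:

```perl
use strict;
use warnings;

my %age = (carol => 41, alice => 30, bob => 25);

# keys %age comes back in no particular order, so sort explicitly.
my @ordered = sort keys %age;

for my $name (@ordered) {
    print "$name => $age{$name}\n";
}
```

The sort costs extra time on every pass, which is the price of getting ordered output from a structure that does not keep its keys in order.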
"Hashes do not require the rebalancing steps that are required by trees."
Well, yes they do. Anything that grows dynamically, without knowing the whole list ahead of time, may need to be rebalanced; for a hash that means enlarging the bucket table and redistributing the keys. But Perl does that for you.
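If you do know roughly how many keys are coming, Perl lets you ask for the buckets up front by assigning to keys(). A minimal sketch; the size 2048 and the generated key names are arbitrary, and whether this gives a measurable speedup for a particular workload is an assumption you would need to benchmark.

```perl
use strict;
use warnings;

my %count;

# Assigning to keys() asks Perl to allocate at least this many
# buckets now, instead of growing the table step by step later.
keys(%count) = 2048;

$count{"word$_"}++ for 1 .. 1000;

print scalar(keys %count), " keys stored\n";   # prints: 1000 keys stored
```

Without the assignment the program behaves the same; Perl just enlarges the table on its own as the keys arrive.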
-QM
--
Quantum Mechanics: The dreams stuff is made of