in reply to Re: Associative array
in thread Associative array
Perl hashes are not associative arrays,
Perhaps you could explain what you see as the difference? (Cos a lot of people are happy to refer to Perl's hashes as 'associative arrays'.)
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^3: Associative array
by ELISHEVA (Prior) on Jul 13, 2009 at 08:09 UTC | |
Indeed. I put this somewhere in the same category as "lists in scalar context". It fits the orthography and good luck trying to get people to see differently. However, if one views a hash as an associative array one is likely to end up in all sorts of confusion, especially if one is first familiar with Lisp. Two of the most important sources of confusion are: (a) confusion of implementation with concept and (b) confusion over ordering. The name "associative array" confuses an implementation of a hash with the concept of a hash. Conceptually, a Perl hash is a data structure that will allow one to retrieve a value by name rather than position. There are many ways to implement this. For example, in Lisp there are two different ways to get hash-like effects: alists and plists. In many other languages, Perl included, hashes (sometimes referred to as dictionaries) are implemented using data structures that are much more complicated than either a-lists or p-lists. Perl in fact has two entirely different C structures for hashes (HV) and arrays (AV). Although both have ARRAY, FILL, and MAX to describe their memory usage, the similarity stops there. The HV structure is carefully organized to maximize the speed with which values can be retrieved by property name and to conserve storage (in some cases keys are stored as pointers to a shared set of keys rather than having their own private string representation). See PerlGuts Illustrated. Now, I suppose you could argue that even Perl stores key-value pairs in its ARRAY member and that makes it an associative array. However, that is a misleading oversimplification. And it brings us to the next issue: ordering. Another problem with viewing hashes as associative arrays is that an array has an explicit ordering to its elements, whether those elements are simple scalars or key-value pairs. Hashes, by contrast, have an undefined unordering. Failing to grasp the distinction can cause serious problems, including hard to track intermittent bugs. One who views a hash as an associative array is likely to assume that ordering of hash members is consistent between runs of programs and may perhaps base their program on that assumption. However, this is a dangerous assumption. Perl hashes have an unpredictable order by intent. One of the techniques to safeguard software from algorithmic complexity attacks is to randomize the order of hash keys, thus making it harder to find a predictable spot to insert dangerous code. See keys and perlsec for further discussion. Best, beth | [reply] [d/l] [select] |
by tilly (Archbishop) on Jul 13, 2009 at 09:57 UTC | |
As Wikipedia says, an associative array is an abstract data type that maps a set of keys onto values. The analogy to an array is that arrays associate indexes with values, while associative arrays associate more general keys with values. So it is like an array, but with a more flexible choice of association. (Depending on the language and/or library, keys may even be arbitrary objects.) There are many ways to implement them. In fact the two Lisp examples you gave are valid (albeit inefficient) implementations! The term "associative array" is used in many languages. You can see how ubiquitous the term is across languages by googling for associative array. When I did that, the top links were to the Wikipedia page I gave, then articles on PHP, JavaScript, Oracle PL/SQL, PHP again, an entry in a data structures dictionary, an article on C++ compilers, a Perl article, etc. Clearly the term is very well established in the broader programming world. By contrast when we say hash we're referring to the implementation. Perl hashes are internally implemented using hash tables. (Common Lisp also has hash tables built in.) However there is no guarantee that Perl's hashes will always be so implemented. Perl has changed hashing algorithms in the past, and theoretically could switch to using a BTree at some point in the future. (Extremely unlikely though.) | [reply] |
by Anonymous Monk on Jul 13, 2009 at 10:04 UTC | |
| [reply] |
by ELISHEVA (Prior) on Jul 13, 2009 at 11:59 UTC | |
When I first responded to the OP I suspect I was misreading the term "associative array" for "association list" (the particular way the OP was using hash and array reinforced that for me). But I persisted in my point because the more I think about it, I really do think that the term "associative array" is one of the not-so-smart-terms that has developed in our industry and is fundamentally misleading, despite a history that goes back at least as far as AWK. The term 'associative array' isn't implementation free to everyone, Wikipedia and even some recent text books aside. At the bottom of the Wikipedia discussion page for "associative array" there is a brief back and forth about potential confusion between the two terms "associative list" and "associative array" (the decision was to discourage implementation specific meanings). Consider also, for example, this definition from Mastering Algorithms with C, (Kyle Louden, O'Reilly, 1999) which defines an associative array as: Associative arrays consist of data arranged so that the nth element of one array corresponds to the nth element of another. p. 142 I agree with you that the term "hash" is also laden with implementation associations, though less so, in my opinion. Although the term hash implies a hash function (as opposed to a balanced tree which uses structure to facilitate retrieval), there are many different hash function implementations, so ultimately even a hash function is more of a concept than a specific implementation. More importantly, it is hard to make assumptions about the physical arrangement of a data structure when the "implementation" is a category of functions. I don't think the same thing can be said of the word "array". An array is a physical structure that supports both sequenced and random access. Because its indices are integers, it has implications for ordering that are directly contradictory to the intent of the abstract meaning of the term "associative array" used in the AWK documentation or the Wikipedia definition. Both those definitions attempt to generalize the term "array" by focusing on random access (via non-numerical keys) at the expense of sequenced access. There are, of course, better conceptual terms: "map", "dictionary", or even "association container". If we were talking Java, I would use the term Map (Java's abstract container for this purpose). However, in Perl we are stuck with "hash" and it is the Perl use of the word we need to explain. Best, beth | [reply] |
by tilly (Archbishop) on Jul 13, 2009 at 19:05 UTC | |
by BrowserUk (Patriarch) on Jul 13, 2009 at 14:34 UTC | |
Hm. Sounds like you're trying to start YAPS (Yet Another Perl Shibboleth). You'll excuse me if I do not wish you well with that. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] |
by ELISHEVA (Prior) on Jul 14, 2009 at 05:46 UTC | |
Your initial question gave me an opportunity to think through some issues that have been annoying me for some time and for that I am grateful. But I am puzzled by the response above. Best, beth | [reply] [d/l] [select] |
by BrowserUk (Patriarch) on Jul 14, 2009 at 08:49 UTC | |