in reply to Re^3: Associative array
in thread Associative array

Hm. Sounds like you're trying to start YAPS (Yet Another Perl Shibboleth). You'll excuse me if I do not wish you well with that.

  1. The alists & plists stuff is a read [intentially sic] herring:

    because, (to the best of my very limited Lisp knowledge), neither can perform the fundamental operation of both associative arrays and hashtables: that of random access of values by key.

    To the best of my knowledge, access to both is strictly sequential from the head.

  2. The implementation details are irrelevant:

    because associative array is an abstract datatype defined in terms of its operations. It can be implemented many different ways.

  3. The implicit ordering is a misnomer;

    You are conflating the order in which the data is accessed when iterated, with the ordering of the data itself.

    • Nobody except the rawest newbie expects AA/hashtables to be ordered.

      And when they do, its for the wrong reasons. As LW is reputed to have said: "iterating over the keys of a hash is like clubbing someone to death with a loaded Uzi".

    • Saying that ordinary arrays have implicit ordering is equally a misnomer.

      If you iterate an array in ascending index order; shuffle the contents and again iterate by ascending index, the data (values) will be differently ordered.

    • The pseudo-randomising of the hash seed for security reasons has as much to do with the iteration order of a hash, as the type of lock on a door has to do with the order people will come through it in the morning.

      The hash seed does not determine the order of iteration. That is (and remains) the sequentially ascending sequence of the bucket array (with diversions for non unitary buckets), just as for arrays.

      The hash seed simply determines which bucket the key/value pair is mapped to. Ie. It affects insertion not traversal.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP PCW

Replies are listed 'Best First'.
Re^5: Associative array
by ELISHEVA (Prior) on Jul 14, 2009 at 05:46 UTC

    Your initial question gave me an opportunity to think through some issues that have been annoying me for some time and for that I am grateful. But I am puzzled by the response above.

    1. In Lisp values are randomly accessible by key via helper functions (assoc or find for alists, get for plists). Are you suggesting that the lack of a syntax to hide the need for helper functions disqualifies them?
    2. The implementation details are irrelevant ... please see my response to tilly: Re^5: Associative array.

    3. Viewing array indices as keys (as opposed to mere access order) is the theoretical basis for claiming that "associative arrays" are a generalization of normal arrays, i.e. they have a more general universe of possible keys. If anyone is doing the confusing, it is whoever first came up with the term "associative array".

      My objection to the term comes from the fact that generalizing the key set also changes the mathematical properties of that key set. And that in turn changes the range of valid operations on those keys and how we work with them on a theoretical and even practical level.

      What operations are different? Well, for one, in a normal array we can meaningfully talk about the distance between elements because integer keys can be subtracted one from another. We can also divide indices (or more precisely the distance between two indices) by some constant. Those two properties play an important role in certain searching and sorting algorithms (binary search and sort for example)

      And then there is ordering. A finite range of integers has a natural well-ordering based on '<='. An arbitrary set of keys only does if we have defined (and applied) a function that maps those keys to the set of positive integers (or some subset). There are many, many ways to do this (locale based, asciibetical). In fact it is mathematically guarenteed that we can find such a function for any finite set. However, this is an additional step that must be done in order to make an associative array behave like a normal array, assuming you even want to do that.

      But often we don't. We use normal arrays and "associative arrays" for very different kinds of algorithms. Associative arrays are best used when our algorithm relies on the semantic significance of each element. Normal arrays are used when our algorithm relies on the order of elements.

      Sorting and reordering only changes what value is associated with what key. It doesn't change the fundamental fact that the values are associated with integral keys and that integers have certain useful mathematical properties. In fact, one of the motivations for using normal arrays in sorting is their ability to map elements to a well ordered set of keys.

    4. Even if Perl is always using bucket order, isn't the net result a different key order? The documentation for keys says (I have bolded the key words):
      The keys are returned in an apparently random order. The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the values or each function produces (given that the hash has not been modified). Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons (see "Algorithmic Complexity Attacks" in perlsec).

      What am I misunderstanding there?

    Best, beth

      1. Are you suggesting that the lack of a syntax to hide the need for helper functions disqualifies them?

        No. I'm saying that (TTBOMK) alist & plists are lists and can only be sequentially accessed--which isn't random access, and therefore disqualifies them from being AAs. Ie. searching them is O(n) instead of O(1).

        As I also said, my Lisp knowledge is limited. Then best evidence I have for this are a couple of quotes from the Common Lisp Manual:hashtables: "Finding the value is very fast, even if there are many entries, because hashing is used; this is an important advantage of hash tables over property lists." & "adding a value to a hash table is a destructive operation; the hash table is modified. By contrast, association lists can be augmented non-destructively."

      2. I read the linked post.
      3. in a normal array we can meaningfully talk about the distance between elements because integer keys can be subtracted one from another. We can also divide indices (or more precisely the distance between two indices) by some constant. Those two properties play an important role in certain searching and sorting algorithms (binary search and sort for example)

        Hm. You can do math with the bucket indicies too. It won't allow you to write a binary search, but then you don't need to.

        'Arrays' & 'associative arrays' are different things. Like 'trees' and 'shoe trees'; 'salmon' and 'rock salmon'; .... See below.

      4. Even if Perl is always using bucket order, isn't the net result a different key order?

        Yes. But you would also get a different data ordering if:

        • they chose a better hashing algorithm;
        • or they adjusted the re-hash threshhold;
        • or if you insert or delete a key;

          The point is that the iteration ordering doesn't change because of the hash seed randomisation. The ordering of the keys (one of a pair of data items) changes. But the iteration order has nothing to do with the ordering of the values--for either hashes or arrays. I'll attempt to make that clearer :)

          The "implicit ordering" of arrays, has nothing to do with the ordering of the values held within it. If you take the same list of data and unshift them into an array; and push them onto an array; the iteration order remains the same; but the perceived ordering of the data will be different(reversed). And if you spliced them into the 'middle', you'd percieve a different ordering when you iterated the array sequentially.

          This is no different with hashes; the buckets are iterated sequentially+. Which perceived ordering values come out in, depends on where they were put during insertion, not the the iteration ordering. And the insertion ordering is dependant upon many things: the current size of the bucket array; the number of values added; the hashing algorithm (including the seed used). Even just adding some keys and then removing them again will caused the percieved ordering of the (same) keys to change:

          undef @h{ 1 .. 8 };; print keys %h;; 6 3 7 2 8 1 4 5 undef @h{ 9 .. 16 };; delete $h{ $_ } for 9 .. 16;; print keys %h;; 7 2 1 6 3 8 4 5
          What am I misunderstanding there?

          I believe you are conflating the iteration ordering and the (resultant) data ordering. The hash seed randomisation affects which buckets the keys are placed in, not the order those buckets are iterated.

          Sure, if all else remains unchanged, the net effect of hash key randomisation might be a difference in the perceived order of the keys as output. But if all else stays equal (including the seed), and any one of those other things changes, you will get the same perception. Just as you would with an array, if a tied array decided to use a hash as it's storage mechanism.

      P>Conclusion: Whilst I do see some potential for confusion between 'array' and 'associative array', it think it is minor. And I don't think most of your arguments in support of your assertion stand up to scrutiny. Programmers will quickly become familiar with the properties of associative arrays and not confuse them with arrays.

      And non-programmers won't know what either are, and might guess they are more similar than they really are. But that's really not a problem because they won't know the properties of arrays, so will have no basis for real confusion.

      Telling others not to use an industry standard term, because you've reached the conclusion that it's a bad one, is ...

      I don't object to your pointing out your objections to the term--I did that myself recently for undefined behaviour--and I might even agree with you. But it is one thing to raise and support one's objections to a industry standard term; it's all together a different thing to tell people off for using it.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.