Beginners are understimating the hashes utility if not ignoring their mere existence. Exprimented users know when to choose an hash or an array. I am trying here to formalize the choice process.

We have a set of elements. Should we shove it in an array or a hash? If the set is unordered, it is clear that the choice is to use a hash:
$hash{$_}++ for   qw( one_element another still_another foo foo );
If we have duplicate elements we get almost for free as associated values the number of duplicates for a given element. Note that we don't have lost the ability to iterate over the set. In the following iteration, duplicates are made consecutive.
print "$_\n" for  map { my $i=$_; map { $i } 1.. $hash{$_} } keys %hash

If the set is ordered, it may be a stack or a queue; or a static list which elements are accessed by rank. In Perl all of those are stored in arrays. We don't have to remember fancy classes names here unlike the OO bondage languages.

Now the interesting case... and less intuitive. The set has no duplicates and is ordered by some (possibly costly) function of the elements values and we are adding and supressing elements all the time. Because it is ordered, we are tempted to use an array; but insertion and deletion in a an array are complex and costly (splices), We may think of a linked list, but really the natural choice is usually the hash (or 2 hashes). Let's call val the ordering function.

We start by the case of one hash
Note that if we don't have an explicit ordering function, the order is just the the elements appearance order: my $i=0; $hash{$_}=$i++  for   qw( one_element another still_another foo  );

Adding an element: $hash{$elt}= val $elt
Suppressing an element: delete $hash{$elt}
Iterating with order: print "$_\n" for sort { $hash{$a} <=> $hash{$b} } keys %hash;

Because we have a one to one correspondance between keys and values, we could use a an additional hash: a "reverse" hash %rev. It has the property: $hash{$rev{$_}} == $_ == $rev{$hash{$_}}
Iterating would be cheap: print "$rev{$_}\n" for sort keys %rev; But the price to pay is to maintain two hashes instead of one when adding or deleting elements. So depending if we spend more time iterating over element or adding and deleting elements, we should use respectively two or one hash. Space consideration may also a concern if using two hashes. In both cases, each iteration incurs the cost of a sort: n log2 n

So if adding and deleting an element is really the rare case compared to iterating, than you are better using an array and the joice of splicing :(

Sure enough, when we want to more operations than adding, deleting and iterating, richer data structures must be considered.

This reflexion was triggered by On Removing Elements from Arrays

Corrected to suppress the unneeded Schwartzian Transform as suggested below by merlyn below. The ordering function is already cached as value associated to key. The whole point of an ST is to avoid to recalculate that function over and over. The correction does not change the validity of the points made in that node. Thanks merlyn

-- stefp


In reply to Using a hash for an ordered set can make sense by stefp

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.