Let's look at this from the point of view of Perl's internals, to understand why it uses more memory than you expected.

When you insert a key-value pair, Perl uses its hash function to calculate a non-unique internal key (the value the hash really uses for indexing). Focus on the word "non-unique": it means other hash elements may share the same internal key.
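To make "non-unique" concrete, here is a minimal sketch. It assumes a toy hash function (summing character codes), which is not Perl's real one, and the names toy_hash and bucket_index are made up for illustration. The bucket index plays the role of the "internal key" above.

```perl
use strict;
use warnings;

# Toy hash function, NOT Perl's real one: sum the character codes.
sub toy_hash {
    my ($key) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $key;
    return $sum;
}

# Map a key to one of $buckets slots; different keys may share a slot.
sub bucket_index {
    my ($key, $buckets) = @_;
    return toy_hash($key) % $buckets;
}

# Group a few sample keys by the slot they land in.
my $buckets = 8;
my %by_bucket;
push @{ $by_bucket{ bucket_index($_, $buckets) } }, $_ for qw(ab ba cat act dog);

for my $i (sort { $a <=> $b } keys %by_bucket) {
    print "bucket $i: @{ $by_bucket{$i} }\n";
}
```

With this toy function, "ab" and "ba" land in the same slot, as do "cat" and "act", while "dog" has a slot to itself.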

Now Perl needs to allocate memory for this key-value pair. It could allocate just enough memory for this one pair, but that would be really slow, as Perl would have to allocate or reallocate memory every time a new key-value pair is inserted. It is a trade-off between performance and memory.

What Perl really does is allocate a chunk of memory to store the pairs related to this internal key.

You can imagine that there is a lot of "waste" (in terms of memory, not speed) at the beginning, when there are not many key collisions. The situation improves progressively as more collisions happen. When a key collides, there is a good chance that Perl does not need to allocate memory for that internal key, because there is still space left for its pairs; instead of allocating a new chunk, it fills up what is already allocated, so the allocated memory gets better used. Of course, once the first chunk is used up, another chunk has to be allocated. On the other hand, too many key collisions are bad news for performance.

In your case, your data is not big, so we can reasonably expect that there are not many key collisions and that a lot of the allocated memory is wasted. A factor of 16 is not a surprise, especially with a two-level hash: at the second level you actually have many hashes. Even worse, if your second-level hashes are all small ones with only 1 or 2 elements, the waste is big.
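If you want to see the overhead of many small second-level hashes for yourself, something like the following rough sketch may help. The data is made up (1000 outer keys with 2 inner keys each), the comma-joined composite key is only one possible flat layout, and Devel::Size is a CPAN module rather than part of core Perl, so the measurement is skipped when it is not installed.

```perl
use strict;
use warnings;

# Hypothetical data: 1000 outer keys, each with a tiny inner hash.
my (%nested, %flat);
for my $outer (1 .. 1000) {
    for my $inner (1 .. 2) {
        $nested{$outer}{$inner} = 1;      # two-level hash
        $flat{"$outer,$inner"}  = 1;      # one composite key per pair
    }
}

# Both layouts hold the same number of values.
my $nested_count = 0;
$nested_count += scalar(keys %$_) for values %nested;
printf "values: nested=%d flat=%d\n", $nested_count, scalar(keys %flat);

# Devel::Size is not core; only measure if it can be loaded.
if (eval { require Devel::Size; 1 }) {
    printf "nested: %d bytes\n", Devel::Size::total_size(\%nested);
    printf "flat:   %d bytes\n", Devel::Size::total_size(\%flat);
}
else {
    print "Devel::Size not installed; install it to compare sizes\n";
}
```

The exact byte counts depend on your Perl version and build, so treat them as a comparison between the two layouts rather than as absolute figures.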

In reply to Re: A more memory efficient storage structure? by pg
in thread A more memory efficient storage structure? by JPaul
