Hi

Question: Do different hashes optimize space by sharing duplicated strings for keys?

I did some RAM benchmarking by generating many small hashes and arrays, and was puzzled that the hashes needed less than 30% more memory than the arrays holding the values.

Then I realized that all hashes were generated with the same 5 keys.

When I created the hashes with random keys of the same length, the memory requirement doubled.

Like for 100_000 hashes with 5 entries

Devel::Peek shows a SHAREKEYS flag:

DB<424> Dump \%h
SV = IV(0x3f1b368) at 0x3f1b378
  REFCNT = 1
  FLAGS = (TEMP,ROK)
  RV = 0x3452878
  SV = PVHV(0x333d148) at 0x3452878
    REFCNT = 2
    FLAGS = (OBJECT,OOK,SHAREKEYS)
    STASH = 0x31619a8 "TST"
    ...

These are test samples of the generated data:

\@ARGV: ["test", "-n_objs=3", "-n_attr=5", "-share_keys=0"] at d:/tmp/pm/bless_hash_overhead.pl line 21.
\%opt: { -n_attr => 5, -n_objs => 3, -share_keys => 0 } at d:/tmp/pm/bless_hash_overhead.pl line 22.
\@types: ["test"] at d:/tmp/pm/bless_hash_overhead.pl line 23.
Running Test Demo at d:/tmp/pm/bless_hash_overhead.pl line 53.
Id : 14940 PM : 2945024 WS : 7933952
*** hash ***
[
  {
    "_0_852324.26186940" => 1,
    "_1_164918.79415432" => 1,
    "_2_708618.00181786" => 1,
    "_3_693984.59335475" => 1,
    "_4_318308.30209681" => 1,
    "_5_354558.95958874" => 1,
    "id" => 1,
  },
  {
    "_0_941653.32000406" => 2,
    "_1_026454.70369051" => 2,
    "_2_708950.10790847" => 2,
    "_3_858794.53324083" => 2,
    "_4_112273.33593869" => 2,
    "_5_135182.85796952" => 2,
    "id" => 2,
  },
  {
    "_0_926854.36297414" => 3,
    "_1_445275.68350328" => 3,
    "_2_111781.50175010" => 3,
    "_3_326689.53909899" => 3,
    "_4_069944.31695859" => 3,
    "_5_650963.32051237" => 3,
    "id" => 3,
  },
] at d:/tmp/pm/bless_hash_overhead.pl line 56.

C:/Perl_524/bin\perl.exe -w d:/tmp/pm/bless_hash_overhead.pl test -n_objs=3 -n_attr=5 -share_keys=1
\@ARGV: ["test", "-n_objs=3", "-n_attr=5", "-share_keys=1"] at d:/tmp/pm/bless_hash_overhead.pl line 21.
\%opt: { -n_attr => 5, -n_objs => 3, -share_keys => 1 } at d:/tmp/pm/bless_hash_overhead.pl line 22.
\@types: ["test"] at d:/tmp/pm/bless_hash_overhead.pl line 23.
Running Test Demo at d:/tmp/pm/bless_hash_overhead.pl line 53.
Id : 14752 PM : 2924544 WS : 7909376
*** hash ***
[
  {
    _0_xxxxxxxxxxxxxxx => 1,
    _1_xxxxxxxxxxxxxxx => 1,
    _2_xxxxxxxxxxxxxxx => 1,
    _3_xxxxxxxxxxxxxxx => 1,
    _4_xxxxxxxxxxxxxxx => 1,
    _5_xxxxxxxxxxxxxxx => 1,
    id => 1,
  },
  {
    _0_xxxxxxxxxxxxxxx => 2,
    _1_xxxxxxxxxxxxxxx => 2,
    _2_xxxxxxxxxxxxxxx => 2,
    _3_xxxxxxxxxxxxxxx => 2,
    _4_xxxxxxxxxxxxxxx => 2,
    _5_xxxxxxxxxxxxxxx => 2,
    id => 2,
  },
  {
    _0_xxxxxxxxxxxxxxx => 3,
    _1_xxxxxxxxxxxxxxx => 3,
    _2_xxxxxxxxxxxxxxx => 3,
    _3_xxxxxxxxxxxxxxx => 3,
    _4_xxxxxxxxxxxxxxx => 3,
    _5_xxxxxxxxxxxxxxx => 3,
    id => 3,
  },
] at d:/tmp/pm/bless_hash_overhead.pl line 56.
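
For anyone who wants to reproduce the comparison, here is a minimal sketch of a generator that produces data shaped like the samples above. It is not the original bless_hash_overhead.pl; the names make_hashes and count_distinct_keys are hypothetical, and the script uses core Perl only. It builds the hashes either with identical keys or with random keys of the same length, then counts how many distinct key strings exist in each case, which is the amount of key storage that could be shared at best. Actual memory use still has to be measured with a tool of your choice.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from the original script.
# Builds $n_objs hashes, each with an "id" entry plus $n_attr attributes.
# If $share_keys is true, every hash uses the same key strings;
# otherwise each key gets a random suffix of the same length.
sub make_hashes {
    my ($n_objs, $n_attr, $share_keys) = @_;
    my @objs;
    for my $id (1 .. $n_objs) {
        my %h = (id => $id);
        for my $i (0 .. $n_attr - 1) {
            my $suffix = $share_keys
                ? 'x' x 15                             # fixed 15-char suffix
                : sprintf('%015.8f', rand 1_000_000);  # random suffix, e.g. 026454.70369051
            $h{"_${i}_$suffix"} = $id;
        }
        push @objs, \%h;
    }
    return \@objs;
}

# Hypothetical helper: number of different key strings across all hashes.
sub count_distinct_keys {
    my ($objs) = @_;
    my %seen;
    for my $h (@$objs) {
        $seen{$_} = 1 for keys %$h;
    }
    return scalar keys %seen;
}

for my $share (1, 0) {
    my $objs = make_hashes(1_000, 5, $share);
    printf "share_keys=%d hashes=%d distinct key strings=%d\n",
        $share, scalar @$objs, count_distinct_keys($objs);
}
```

With identical keys the number of distinct key strings stays constant no matter how many hashes are created, while with random keys it grows with every hash. That is consistent with the memory difference described above if, as the SHAREKEYS flag suggests, perl stores each distinct key string only once.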

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery


In reply to do separate hashes share common keys? by LanX
