Almost all of those extra 5 seconds is spent allocating and reallocating memory as the hash is extended.

In theory, you could pre-extend your hash by using keys as an lvalue: keys %hash = 1234567;

In reality, there are two problems with this.

  1. The number you assign is the number of "buckets" that are pre-allocated to the hash; but there is no way to pre-determine how many buckets your hash will need, even if you can predict how many keys you will assign.
  2. The space allocated to "buckets" is only a tiny proportion of the space required to build the hash.

    If you run the following one-liner:

    perl -e"keys %hash = 2_000_000; <>; $hash{ $_ }++ for 1 .. 2_000_000; +<>;"

    then, when activity stops at the first input, the space allocated to the process will be around 18 MB.

    If you then hit enter, the hash is populated and the space requirement grows to around 192 MB.
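
To make point 1 concrete, here is a minimal sketch of what the lvalue assignment actually reserves. It assumes Perl 5.26 or later, where Hash::Util exports bucket_ratio; presized_ratio is a hypothetical helper name used only for illustration. perl rounds the requested bucket count up to the next power of two, so the total reported is normally larger than the number you asked for.

```perl
use strict;
use warnings;
use Hash::Util qw(bucket_ratio);    # assumption: Perl 5.26 or later

# Hypothetical helper: presize a hash, add one key, and report the
# "used/total" bucket string for it.
sub presized_ratio {
    my ($want) = @_;
    my %h;
    keys(%h) = $want;    # lvalue keys: ask for at least $want buckets
    $h{probe} = 1;       # an empty hash reports 0, so store one key first
    return bucket_ratio(%h);
}

# The total after the slash is the number of buckets perl allocated.
print presized_ratio(200), "\n";
```

This only tells you about the bucket array; it says nothing about the memory later needed for the keys and values themselves.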

It seems that most of the time is spent allocating the space for the individual keys and values, not the buckets.
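
If you want to check this on your own build, a rough timing comparison along these lines should do. It is only a sketch: time_fill is a hypothetical helper, the key count of 200_000 is arbitrary, and the timings will vary with the perl build and its memory allocator.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: fill a hash with $n integer keys, optionally
# pre-extending the bucket array first, and return the elapsed time
# in seconds together with the final key count.
sub time_fill {
    my ($n, $presize) = @_;
    my %h;
    keys(%h) = $n if $presize;    # pre-extend buckets only when asked
    my $start = time;
    $h{$_}++ for 1 .. $n;
    my $elapsed = time - $start;
    return ($elapsed, scalar keys %h);
}

for my $presize (0, 1) {
    my ($secs, $count) = time_fill(200_000, $presize);
    printf "presize=%d keys=%d %.3fs\n", $presize, $count, $secs;
}
```

If the two timings come out close to each other, that is consistent with the bucket array being only a small part of the cost.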

In the past, I've tried various schemes to grab the required space from the OS in a single chunk rather than in zillions of itty-bitty bits, but I haven't found a way of doing this with the memory allocator used by the AS-built perls.

You can very quickly grab a large chunk of memory from the OS using 'X' x ( 200*1024**2); for example.

D:\>perl -we"'X'x(200*1024**2); <>; $h{ $_ }++ for 1 .. 2_000_000;<>"
Useless use of repeat (x) in void context at -e line 1.
Name "main::h" used only once: possible typo at -e line 1.

Despite the "Useless use of repeat (x) in void context" message, you'll see that at the first prompt the required 200 MB of space has been allocated to the process (almost instantly), but when you hit enter, the process goes on to grab another 200 MB.

Even though the initial 200 MB was never assigned to anything, that space is never reused. I've tried in vain to find a way to persuade perl to reuse it.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.
Rule 1 has a caveat! -- Who broke the cabal?

In reply to Re: How do I measure my bottle ? by BrowserUk
in thread How do I measure my bottle ? by cbrain
