This is speculation, but it seems to fit your description. Maybe someone from p5p can confirm or deny it?

When a hash fills (to some level), it reallocates memory to increase the number of buckets. When it does it doubles the number.

It won't happen for every new addition as a hash consists of buckets and chains, and if a new addition hashes to the same value as an existing word, it will be added to the end of the chain for the corresponding bucket and no new buckets will be needed. If however, an new word hashes to a previously unused hash value, then the percentage filled criteria will be met and the hash will be reallocated. Ie. a new hash allocated with double the number of buckets and the contents of the existing hash will be re-hashed(?) and moved into the new hash. This process will (temporarily) require 3x the number of buckets (though not the chains I think).

If you are close to your memory limits already, that reallocation could be the cause of the slow down perhaps. Even the 50% increase may not be enough to allow that reallocation.

Note: What I'm describing (badly), is the operations of a normal hash. I'm not sure if MLDBM operates in anything like the same way.

A possibility to test this is to try printing scalar %hash just before and just after adding one of the problem words. How to fix it is another problem, but if you could confirm that my speculation has some foundation, it might lead to a solution.

Like I say, your description made me think about the buckets thing, so I thought I would mention it.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

In reply to Re: MLDBM performance problem by BrowserUk
in thread MLDBM performance problem by henrywalker23

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.