in reply to Re: Hitting memory limit? (ActiveState Perl)
in thread Hitting memory limit? (ActiveState Perl)

On one test box (this is my laptop, I took a copy home with me):

Physical Memory (K)
-----
Total: 261560
Available: 7000 (and dropping)
System Cache: 58088 (and climbing)


Commit Charge (K)
-----
Total: 381188 (pretty static)
Limit: 630652 (very static)
Peak: 395832 (very static)

Kernel Memory (K)
-----
Total: 43932 (very static)
Paged: 35876 (very static)
Nonpaged: 8056 (very static)

I watched it until Physical Memory:Available got all the way down to ~3000. At that point Physical Memory:Available jumped to ~20000, and at the same moment Physical Memory:System Cache jumped by ~1000. Then they both started dropping/climbing as before.

I will run with your Win32::API::Prototype script and post results at the first available opportunity (Friday, likely).

Re: Re: Re: Hitting memory limit? (ActiveState Perl)
by BrowserUk (Patriarch) on Jan 22, 2004 at 08:27 UTC

    Your Total Committed Charge far exceeds your Total Physical Memory--by about 50%. You've been into swapping for a considerable time.

    Although the TCC is fairly static, suggesting that the hash isn't growing much, each time you access a key within the existing hash it's quite possible that perl has to cycle through the entire hash structure to locate the next value, which in turn could mean the OS having to swap a huge portion of the process's image.

    You probably had trouble hearing your mp3 over the sound of the disc thrashing--your neighbours probably had the same problem :)

    Perl's hashes are pretty memory hungry, and if you are nesting them, filling memory doesn't take much effort. I induced this machine, with 512MB of RAM, into swapping in 80 seconds with a hash containing a little under 6_000_000 keys with empty (undef) values. If your values are simple scalars you'll get there much quicker; if they are themselves hashes or arrays, quicker still.

    sub rndStr{ join '', @_[ map{ rand @_ } 0 .. shift ] };
    $| = 1;    # autoflush (not $!, which is errno)
    ( $_ % 1000 or printf( "\r$_ : %s ", times ) )
        and $h{ rndStr 8, 'a'..'z', 'A'..'Z', 0..9 } = undef
            for 0 .. 10_000_000;

    Last line printed (keys : CPU seconds): 5858000 : 80.703

    You could find out how many keys it takes to induce swapping by disabling swapping and logging scalar keys %your_main_hash periodically as you fill it. If the last number displayed before you get Perl's "Out of memory!" error is not too far short of your expected final size, then it might be worth seeing how you could use less memory. If it is a long way short, then that effort is almost certainly not worth it.
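    That logging can be as simple as a print inside the fill loop. A minimal sketch; the loop body and %your_main_hash are stand-ins for the real data-collection code:

```perl
use strict;
use warnings;

my %your_main_hash;
$| = 1;    # autoflush, so each progress line appears as soon as it is printed

# Stand-in for the real fill loop.
for my $i ( 1 .. 100_000 ) {
    $your_main_hash{"key_$i"} = $i;

    # Log the key count periodically; the last figure printed before
    # Perl dies with "Out of memory!" is the practical ceiling.
    print "\rkeys so far: ", scalar keys %your_main_hash
        if $i % 10_000 == 0;
}
print "\nfinal key count: ", scalar keys %your_main_hash, "\n";
```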

    There are some alternatives to hashes which can save memory, depending on the nature of your keys and what use you are making of the hash (see A (memory) poor man's <strike>hash</strike> lookup table. for one possible, partial solution), but if you are regularly dealing with these volumes of data, a DB of some form is probably your easiest option.
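    One middle ground short of a full database (my sketch, not something proposed in the thread) is tying the hash to an on-disk DBM file: lookups keep the hash syntax, but the data lives on disk instead of in process memory. This uses the core SDBM_File module; note SDBM limits each key+value record to roughly 1KB, so DB_File or a real database is a better fit for large values:

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;    # core module: ties a hash to an on-disk DBM file

my %h;
tie %h, 'SDBM_File', 'demo_db', O_RDWR | O_CREAT, 0666
    or die "Cannot tie demo_db: $!";

$h{foo} = 42;             # stored on disk, not in process memory
my $got = $h{foo};        # read back through the tie
print "foo = $got\n";

untie %h;
unlink glob 'demo_db.*';  # SDBM creates demo_db.dir and demo_db.pag
```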


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    Timing (and a little luck) are everything!

      Thank you very much. I will definitely follow up on these suggestions. I suspect the best bang for my buck will be to look at some of the bigger data-collection points and optimize (since that won't take very long to test), then (since that probably won't work) switch to a database as previously suggested, and additionally look at general optimization of memory use.
      Regarding memory optimization, and the information provided in the "poor man's hash" link... would there be any value in switching the primary data structure to a straight array, and using a hash as a lookup index (so the number of primary keys in the hash would be the same, but the value would be nothing more than an element # in the array)?

        In a word, no. The overhead of a scalar (the hash value) is the same regardless of what it contains. Using a hash to point into an array would just add the overhead of the entire array structure.
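        That extra overhead can be measured with Devel::Size from CPAN (not a core module). A rough sketch; the exact byte counts vary by Perl version and build, but the split layout always costs at least the array structure plus 10_000 index scalars more:

```perl
use strict;
use warnings;
use Devel::Size qw(total_size);    # CPAN module, not in the core distribution

# Plan A: keys map directly to the values.
my %direct = map { ( "key_$_" => "value_$_" ) } 1 .. 10_000;

# Plan B: keys map to array indices; the values live in the array.
my @values = map { "value_$_" } 1 .. 10_000;
my %index  = map { ( "key_$_" => $_ - 1 ) } 1 .. 10_000;

my $direct_bytes = total_size( \%direct );
my $split_bytes  = total_size( \%index ) + total_size( \@values );

printf "direct hash:        %d bytes\n", $direct_bytes;
printf "index hash + array: %d bytes\n", $split_bytes;
```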


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
        Timing (and a little luck) are everything!