flexvault has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I use hashes all the time, but I haven't had this problem before. I found a work-around, but it doesn't make sense! So any suggestions will be appreciated.

I have one hash (%RCache) that in this case should hold approximately 8192 key/value pairs. Each key is an address of a memory location on disk, and each value is a string of at most 2048 bytes. The code shows 'our %RCache', but that's part of the work-around. I originally declared it with 'my', which worked fine, except that I could not reset it in the 'CheckMemoryUsage' subroutine.
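For reference, a file-scoped 'my' hash is normally visible to any subroutine defined later in the same file, so it can be cleared from inside one. The sketch below shows that, assuming the declaration comes before the sub; '%cache' and 'reset_cache' are illustrative names, not the original code.

```perl
use strict;
use warnings;

# File-scoped lexical: visible to every sub defined after this line
# in the same file (assumption: the declaration precedes the sub).
my %cache;

# Illustrative helper: empty the cache and report how many keys it held.
sub reset_cache {
    my $before = scalar keys %cache;
    %cache = ();    # clears the same hash the main code fills
    return $before;
}

# Fill the cache with 100 page-sized strings, then reset it.
$cache{$_} = 'x' x 2048 for 1 .. 100;
my $dropped = reset_cache();
my $left    = scalar keys %cache;
print "dropped $dropped keys, $left left\n";
```

If the sub were defined above the 'my' declaration, or in another file, it would not see the lexical, which is one situation where 'our' becomes necessary.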

The subroutine 'GetSubBuffer' is where I test for it in the hash and if not in the hash I then get the record from disk and save a copy in %RCache. That is the only place I use '%RCache' except to reset it in 'CheckMemoryUsage'.

use strict;
use warnings;

use constant CACHESIZE => 2**24;   ## Default size of cache memory (2**24=>16_777_216)
use constant PAGESIZE  => 2048;

our %RCache = ();
keys( %RCache ) = 1020;

$dbenv{DB_RMaxkeys} = CACHESIZE / PAGESIZE;
. . .
if ( scalar keys %RCache > $dbenv{DB_RMaxkeys} ) { &CheckMemoryUsage('WR'); }
. . .
if ( scalar keys %RCache > $dbenv{DB_RMaxkeys} ) { &CheckMemoryUsage('RN'); }
. . .
if ( scalar keys %RCache > $dbenv{DB_RMaxkeys} ) { &CheckMemoryUsage('RP'); }
. . .
# my $ret = &GetSubBuffer(\$db,$subtreeptr,\$buffer, $log )
sub GetSubBuffer
{
    my ( $db, $ptr, $buffer, $log ) = @_;
    my $size = $dbenv{DB_InternalPageSize};
    if ( ( $CACHE == 1 )&&( exists $RCache{$ptr} ) )
    {
        $$buffer = $RCache{$ptr};
    }
    else
    {
        if ( $ptr < $size ) { die " GetSubBuffer: $log subtreeptr $ptr <= 0\n"; }
        $ret = sysseek( $$db{btree}, $ptr, 0);   # move to subtree location in file
        if ( ! defined $ret ) { die " GetSubBuffer: $log sysseek failed:|$ptr| $!\n"; }
        $ret = sysread( $$db{btree},my $tmpbuf, $size );
        if ( $ret != $size ) { die " GetSubBuffer: $log sysread failed: $!\n"; }
        my $reclen = unpack("N", substr($tmpbuf,0,4));
        if ( ! defined $$buffer ) { die "$log buffer not defined!"; }
        substr($$buffer,0,$reclen+4,$tmpbuf);    #*#
        if ( $CACHE == 1 ) { $RCache{$ptr} = $$buffer; }
    }
}

sub CheckMemoryUsage
{
    our %RCache;
    use Devel::Size qw(total_size);
    my $log = shift;
    my $stime = gettimeofday;
    my $keys = scalar keys %RCache;
    print $DLOG "MEM_CK-$log: Enter: Keys:$keys \%RCache Size: ",total_size(\%RCache),"\n";
    my ( $vmem, $rmem ) = &Display_Mem_Usage($$,$NAME,0);
    my $rkeys = scalar keys %RCache;

    ## 1. Failing code:
    %RCache = ();
    keys( %RCache ) = 1020;   ## THIS DOESN'T WORK AFTER 1ST PASS

    ## 2. This works:
    if ( $rkeys > $dbenv{DB_RMaxkeys} )
    {
        my $killkeys = int($rkeys/2);
        foreach my $key ( keys %RCache )
        {
            $killkeys--;
            if ( $killkeys < 0 ) { last; }
            delete $RCache{$key};
        }
    }

    my $etime = sprintf("%.4f",gettimeofday - $stime);
    $keys = scalar keys %RCache;
    print $DLOG " Exit: Keys:$keys \%RCache Size: ",total_size(\%RCache)," Time:$etime\n";
}

When I run it as 1. above, the results are:

############################# Tue Jan 3 10:49:32 2012 Start. . .
## Start: VSZ-6928_KB-0 RSS-3988_KB-0 BLOCK: 2048 Tue Jan 3 10:49:32 2012
MEM_CK-WR: Enter: Keys:8193 %RCache Size: 16027140
 Exit: Keys:0 %RCache Size: 65592 Time:0.0304
MEM_CK-WR: Enter: Keys:8193 %RCache Size: 70629951
 Exit: Keys:0 %RCache Size: 65592 Time:0.0375
MEM_CK-WR: Enter: Keys:8193 %RCache Size: 104741100
 Exit: Keys:0 %RCache Size: 65592 Time:0.0399
MEM_CK-WR: Enter: Keys:8193 %RCache Size: 105994568
 Exit: Keys:0 %RCache Size: 65592 Time:0.0406
MEM_CK-WR: Enter: Keys:8193 %RCache Size: 111110214
 Exit: Keys:0 %RCache Size: 65592 Time:0.0414
MEM_CK-RN: Enter: Keys:8193 %RCache Size: 112428340
 Exit: Keys:0 %RCache Size: 65592 Time:0.0480
MEM_CK-RN: Enter: Keys:8193 %RCache Size: 17416496
 Exit: Keys:0 %RCache Size: 65592 Time:0.0492
MEM_CK-RN: Enter: Keys:8193 %RCache Size: 17416495
 Exit: Keys:0 %RCache Size: 65592 Time:0.0526
MEM_CK-RP: Enter: Keys:8264 %RCache Size: 253096813
 Exit: Keys:0 %RCache Size: 65592 Time:0.0644
MEM_CK-RP: Enter: Keys:8220 %RCache Size: 400929483
 Exit: Keys:0 %RCache Size: 65592 Time:0.0782
## End: VSZ-465472_KB-0 RSS-462424_KB-0 Diff: 458544|458436_KB-0
####### Devel::Size #########
%RCache: 374,120,916 No of keys: 7481

Look how large %RCache gets! When I run it as 2., I get the results below, which are much better. I put commas in the final size numbers so you can see how much larger %RCache is in the first case, while it stays at an acceptable size with work-around 2.

############################# Tue Jan 3 10:01:59 2012 Start. . .
## Start: VSZ-6928_KB-0 RSS-3988_KB-0 BLOCK: 2048 Tue Jan 3 10:01:59 2012
MEM_CK-WR: Enter: Keys:7325 %RCache Size: 14293576
 Exit: Keys:3663 %RCache Size: 7165898 Time:0.0285
MEM_CK-WR: Enter: Keys:7325 %RCache Size: 16593819
 Exit: Keys:3663 %RCache Size: 7170630 Time:0.0300
. . .
MEM_CK-WR: Enter: Keys:7392 %RCache Size: 31464649
 Exit: Keys:3696 %RCache Size: 7251138 Time:0.0353
MEM_CK-WR: Enter: Keys:7326 %RCache Size: 31293329
 Exit: Keys:3663 %RCache Size: 7189706 Time:0.0348
MEM_CK-RN: Enter: Keys:7325 %RCache Size: 30792296
 Exit: Keys:3663 %RCache Size: 7189758 Time:0.0339
MEM_CK-RN: Enter: Keys:7325 %RCache Size: 14945032
 Exit: Keys:3663 %RCache Size: 7189758 Time:0.0381
. . .
MEM_CK-RN: Enter: Keys:7325 %RCache Size: 14945069
 Exit: Keys:3663 %RCache Size: 7189758 Time:0.0490
MEM_CK-RP: Enter: Keys:7361 %RCache Size: 26095864
 Exit: Keys:3681 %RCache Size: 7280837 Time:0.0484
. . .
MEM_CK-RP: Enter: Keys:7361 %RCache Size: 31337851
 Exit: Keys:3681 %RCache Size: 7291441 Time:0.0477
## End: VSZ-127068_KB-0 RSS-124144_KB-0 Diff: 120140|120156_KB-0
####### Devel::Size #########
%RCache: 21,012,262 No of keys: 5763

This doesn't make sense, but I also tried to undef %RCache, and that didn't work. I have never worked with a hash this large, but I hope that it shouldn't make a difference. In testing, I did everything using '2**20', and I didn't see a memory leak, but when I used '2**24' and above, something is going wrong! 'Devel::Size' was not in the original code, but added to try to figure out what is going wrong.
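One thing that may be worth checking alongside 'Devel::Size': 8192 values of at most 2048 bytes would be about 16,777,216 bytes, yet the first log reports sizes in the hundreds of megabytes for roughly 8,200 keys. A minimal sketch for measuring the stored string lengths directly follows; 'value_length_stats' and the '%demo' data are hypothetical and only stand in for %RCache.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise the string lengths stored in a hash.
# Returns ( number of values, total bytes, longest value ).
sub value_length_stats {
    my ($hash) = @_;
    my ( $count, $total, $max ) = ( 0, 0, 0 );
    for my $value ( values %$hash ) {
        my $len = length $value;
        $count++;
        $total += $len;
        $max = $len if $len > $max;
    }
    return ( $count, $total, $max );
}

# Toy data standing in for %RCache: two page-sized values, one oversized.
my %demo = (
    4096 => 'a' x 2048,
    6144 => 'b' x 2048,
    8192 => 'c' x 50_000,
);

my ( $count, $total, $max ) = value_length_stats( \%demo );
print "keys:$count total bytes:$total longest value:$max\n";
```

Logging the longest value at the 'Enter' line of 'CheckMemoryUsage' would show whether any cached string is larger than the expected page size, independent of how the hash is reset.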

I also tried this on 2 other *nix boxes, with the same results. I tried Perl 5.8.8, 5.10.1, 5.12.2 and 5.14.1. So any suggestions?

Thank you

Update: I just looked at the code and I see I'm passing a reference to '%RCache' to 'Devel::Size', but I had the problem before I added 'Devel::Size'.

"Well done is better than well said." - Benjamin Franklin

Replies are listed 'Best First'.
Re: Hash memory leak: posible scope issue?
by BrowserUk (Patriarch) on Jan 03, 2012 at 19:23 UTC

    Does all the code above reside in the same file and package?


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      Yes, for testing I combined the test-case with the code for simplicity. Should I separate them?

      "Well done is better than well said." - Benjamin Franklin

        Should I separate them?

        No. My thought was that the only way I could see of explaining your results is that your initial declaration:

        our %RCache = (); keys( %RCache ) = 1020;

        was in a different package to your reset subroutine:

        sub CheckMemoryUsage { our %RCache; ...

        and that because you redeclared the hash locally to the reset sub, you were manipulating two different hashes. But that doesn't make sense as the ENTER size would then be zero.
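        To illustrate that speculation, here is a minimal sketch of how 'our' in two different packages names two different hashes. The package names 'Loader' and 'Checker' and the sub 'key_count' are made up for the example; nothing here is taken from the original program.

        ```perl
        use strict;
        use warnings;

        {
            package Loader;
            # 'our' here aliases the package variable %Loader::RCache.
            our %RCache = ( 4096 => 'page data' );
        }

        {
            package Checker;
            # Same short name, but this aliases %Checker::RCache instead.
            our %RCache;

            # Counts keys in %Checker::RCache, which nothing ever fills.
            sub key_count { return scalar keys %RCache }
        }

        package main;

        my $seen_by_checker = Checker::key_count();
        my $in_loader       = scalar keys %Loader::RCache;
        print "Checker sees $seen_by_checker keys, Loader holds $in_loader\n";
        ```

        In that situation the sub would always report an empty hash on entry, which is why it does not match the log shown above.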

        I've been unable to reproduce your results in a simplified test:

        #! perl -slw
        use strict;
        use Devel::Size qw[ total_size ];

        our %cache;
        keys %cache = 1020;

        sub resetCache {
            our %cache;
            printf "Before: keys: %u size: %u\n", scalar keys %cache, total_size( \%cache );
            %cache = ();
            printf "After: keys: %u size: %u\n", scalar keys %cache, total_size( \%cache );
        }

        while( 1 ) {
            $cache{ int( rand 2**32 ) } = chr(0) x 2048;
            if( keys %cache >8192 ) {
                resetCache();
            }
        }
        __END__
        C:\test>junk29
        Before: keys: 8193 size: 17727631
        After: keys: 0 size: 131144
        Before: keys: 8193 size: 17727579
        After: keys: 0 size: 131144
        Before: keys: 8193 size: 17727590
        After: keys: 0 size: 131144
        Before: keys: 8193 size: 17727614
        After: keys: 0 size: 131144
        Before: keys: 8193 size: 17727556
        After: keys: 0 size: 131144
        Before: keys: 8193 size: 17727578
        After: keys: 0 size: 131144
        Before: keys: 8193 size: 17727621
        After: keys: 0 size: 131144
        ...

        And the question was me speculating about possible scenarios.

