I'm trying to use Devel::Size to get the size of a large hash -- 5 million or so entries, where the keys are random strings of 32 characters plus "\n", and the values are all undef.

If I run the code below, which builds a hash of just 1 million entries, I get:

                   :  Start   : 1st Read : undef    : 2nd Read :
    Hash size      :   0.0 MB :  71.9 MB :   0.0 MB :  71.9 MB :
    Total hash     :   0.0 MB :  94.8 MB :   0.0 MB :  94.8 MB :
                   :          :          :          :          :
    Virtual Memory : 100.9 MB : 391.1 MB : 383.1 MB : 399.1 MB :
    Resident       :   2.8 MB : 292.9 MB : 284.9 MB : 300.9 MB :
So... after the 1st read of the 31.5 MB file, the hash is apparently 94.8 MB: 71.9 MB given by Devel::Size::size(\%hash), plus 24 * 1E6 bytes, since each undef entry is 24 bytes. This is odd: 94.8 MB of hash appears to require ~290 MB of memory? I tried commenting out the assignment $hash{$_} = undef, and the loop consumed no memory at all, so whatever is going on, it has to do with the hash!
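The arithmetic behind those figures can be checked with a short sketch. It uses only the numbers reported in the table (71.9 MB from Devel::Size::size, 24 bytes per undef value, 1E6 entries, and the Resident row); the variable names are illustrative and are not part of the script.

```perl
use strict ;
use warnings ;

my $MB        = 1024 * 1024 ;   # the script reports sizes in units of 1024 * 1024 bytes
my $entries   = 1E6 ;           # entries in the hash after the 1st read
my $hash_mb   = 71.9 ;          # reported by Devel::Size::size(\%hash)
my $per_value = 24 ;            # bytes reported for a single undef value

# Total hash = the hash structure plus one undef value per entry
my $total_mb = $hash_mb + $per_value * $entries / $MB ;

# Growth in resident memory across the 1st read, taken from the table
my $resident_growth_mb = 292.9 - 2.8 ;

printf "Total hash      : %5.1f MB\n", $total_mb ;
printf "Resident growth : %5.1f MB\n", $resident_growth_mb ;
printf "Ratio           : %5.1f\n",    $resident_growth_mb / $total_mb ;
printf "Bytes per entry : %5.0f\n",    $resident_growth_mb * $MB / $entries ;
```

This reproduces the 94.8 MB total and puts the growth in resident memory at roughly three times that figure, or about 300 bytes for each 33-byte key.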

When I undef %hash, the memory footprint reduces by just 8 MB (391.1 MB -> 383.1 MB). I cannot tell how much free space Perl is holding on to, so this may or may not be reasonable.

When the file is read the 2nd time, the memory footprint grows to a bit more than it was after the 1st read. I'm not sure what to make of that.

Finally, if I comment out the call to Devel::Size::size() the System Monitor tells me:

                   :  Start   : 1st Read : undef    : 2nd Read :
    Virtual Memory : 100.9 MB : 270.0 MB : 262.0 MB : 277.9 MB :
    Resident       :   2.8 MB : 171.8 MB : 163.8 MB : 179.7 MB :
which is similar, except that the footprint is some 120 MB less! I don't know what that tells me about the usefulness of Devel::Size.
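The Virtual Memory and Resident figures could also be sampled from inside the script rather than read off the System Monitor. A minimal sketch, assuming a Linux /proc filesystem: it reads the VmSize and VmRSS lines of /proc/self/status, which the kernel reports in kB. The name footprint_mb is hypothetical, and the values should correspond only roughly to what the System Monitor displays.

```perl
use strict ;
use warnings ;

# Return (VmSize, VmRSS) of the current process in MB, as read from
# /proc/self/status (Linux only; the kernel reports both values in kB).
sub footprint_mb {
  open my $fh, "<", "/proc/self/status"
    or die "Cannot open /proc/self/status: $!" ;
  my %kb ;
  while (my $line = <$fh>) {
    $kb{$1} = $2 if $line =~ m/^(VmSize|VmRSS):\s+(\d+)\s+kB/ ;
  } ;
  close $fh ;
  # A missing field is reported as 0 rather than undef
  return map { ($kb{$_} // 0) / 1024 } qw(VmSize VmRSS) ;
} ;

my ($virtual, $resident) = footprint_mb() ;
printf STDERR "Virtual Memory : %6.1f MB\n", $virtual ;
printf STDERR "Resident       : %6.1f MB\n", $resident ;
```

Calling footprint_mb() immediately before and after Devel::Size::size() would show how much of the extra 120 MB or so appears during the measurement itself.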

I'm enquiring because, once I get to 5 million such strings in the hash, the footprint has grown to 1.0 GB or so, and my machine is thrashing when I try to use the hash :-( (and Devel::Size::size() takes longer and longer...)

Help !! (OK, I could go and get some more memory, 1G is not a lot in today's market...)

For completeness: perl, v5.10.0 built for x86_64-linux-thread-multi, and Linux 2.6.25.14-108.fc9.x86_64 #1 SMP Mon Aug 4 13:46:35 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

    use strict ;
    use warnings ;

    my %hash = () ;

    # Measure the cost of a single undef value
    $hash{'dummy'} = undef ;
    printf STDERR "Single entry requires: %d Bytes\n", bytes($hash{dummy}) ;

    read_it() ;

    undef %hash ;
    show(0, \%hash) ;
    wait_for_it('Just "undef"ed the hash') ;

    read_it() ;

    # Read hash.txt, adding each line (including its "\n") as a key
    sub read_it {
      open my $FH, "<", "hash.txt" or die "Cannot open hash.txt: $!" ;

      wait_for_it('About to read') ;

      my $e = 0 ;
      my $c = show($e, \%hash) ;
      while (<$FH>) {
        $hash{$_} = undef ;
        $e++ ;
        $c = ($c - 1) || show($e, \%hash) ;   # report every 50_000 entries
      } ;

      wait_for_it('Finished Reading') ;
    } ;

    # Print the entry count and hash size; returns the reporting interval
    sub show {
      my ($e, $rh) = @_ ;
      printf STDERR "%8d entries: %3.1fM Bytes\n", $e, mbytes($rh) ;
      return 50_000 ;
    } ;

    sub mbytes {
      my ($r) = @_ ;
      bytes($r)/(1024 * 1024) ;
    } ;

    use Devel::Size () ;

    sub bytes {
      my ($r) = @_ ;
      return Devel::Size::size($r) ;
    } ;

    # Pause until a '.' arrives on STDIN, so the footprint can be inspected
    sub wait_for_it {
      print STDERR "$_[0]..." ;
      my $ch = '' ;
      while ($ch !~ m/\./) {
        sysread STDIN, $ch, 1 ;
      } ;
    } ;
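To reproduce the measurements, the script needs a hash.txt to read. A minimal sketch of a generator follows, assuming the keys are drawn from letters and digits (the description above only says "random strings of 32 characters", so the alphabet is an assumption); random_key and write_keys are hypothetical names.

```perl
use strict ;
use warnings ;

my @alphabet = ('a'..'z', 'A'..'Z', '0'..'9') ;   # assumed character set

# Return one random key of $len characters, without the trailing "\n"
sub random_key {
  my ($len) = @_ ;
  return join '', map { $alphabet[rand @alphabet] } 1..$len ;
} ;

# Write $n keys of 32 characters, one per line, to $file
sub write_keys {
  my ($file, $n) = @_ ;
  open my $out, ">", $file or die "Cannot write $file: $!" ;
  print {$out} random_key(32), "\n" for 1..$n ;
  close $out or die "Cannot close $file: $!" ;
} ;

# Number of keys defaults to 1 million; pass another count as the first argument
write_keys("hash.txt", @ARGV ? $ARGV[0] : 1_000_000) ;
```

At 33 bytes per line, 1 million lines come to about 31.5 MB, which matches the file size mentioned above.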

In reply to Completeness of Devel::Size answers by gone2015
