scathlock has asked for the wisdom of the Perl Monks concerning the following question:

My Perl script has weird behaviour which I don't understand. I'm processing a large structure, stored as an array of hashes, which grows while being processed. The problem is that the structure takes at most about 8 MB when I store it on disk, but while it is being processed it takes about 130 MB of RAM. Why is there such a big difference? The main flow of the processing looks like:
while (...)
{
    my $new_el = Storable::dclone \%some_el;

    # ...
    # change a few things in new_el
    # ...

    push @$elements_ref, $new_el; 
}
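
For reference, a minimal sketch (with made-up toy data) of how the two sizes can be compared: the length of a Storable::freeze string approximates what lands on disk, while Devel::Size::total_size reports the in-memory footprint.

use strict;
use warnings;
use Storable qw(freeze);
use Devel::Size qw(total_size);

# Toy stand-in for the real structure: an array of small hashes.
my @elements = map { { type => "t$_", value => "v$_" } } 1 .. 10_000;

# Bytes Storable would write to disk vs. bytes held in RAM.
printf "serialized: %d bytes\n", length freeze(\@elements);
printf "in memory:  %d bytes\n", total_size(\@elements);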


Replies are listed 'Best First'.
Re: Issue with cloning and large structure processing
by BrowserUk (Patriarch) on Apr 10, 2010 at 10:13 UTC

    You're going to have to show us a bit more of your code--like where does %some_el come from?--because on the face of it, 130MB is too big for a hash constructed from 8MB of data.

    This creates an 8MB file of keys and values, loads them into a hash, and the total size is just 12MB:

    c:\test>perl -E"printf qq[%014d: %014d\n], $_, $_ for 1..262144" >junk.dat

    c:\test>dir junk.dat
    10/04/2010  11:06         8,388,608 junk.dat

    c:\test>perl -MDevel::Size=total_size -E"local $/; my %h = split ': ', <>; print total_size \%h;" junk.dat
    12489744

    Of course, if the 8MB contains more than just a flat hash structure, then the memory requirement will be more, but 10x more is stretching the imagination a bit. So, it probably comes down to what else you are doing in your code. Real code is always more likely to result in a resolution than pseudo-code.
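
    As a rough illustration of that point (toy data; exact numbers vary by perl build and platform), the same payload costs far more as many small nested hashes than as one flat hash, because every inner hash carries its own bookkeeping overhead:

    use strict;
    use warnings;
    use Devel::Size qw(total_size);

    # Same payload stored two ways: one flat hash vs. 10,000 tiny inner hashes.
    my %flat   = map { $_ => 'x' x 8 } 1 .. 10_000;
    my %nested = map { $_ => { type => 'x' x 4, value => 'x' x 4 } } 1 .. 10_000;

    print "flat:   ", total_size(\%flat),   "\n";
    print "nested: ", total_size(\%nested), "\n";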


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Ok, I will try, but it's difficult because of the length and logic of my code. First of all - what does my script do? It takes one line of structured string, parses it, and stores it in a hash that looks like:
       $VAR1 = {
                '0' => {
                         'type' => ...,
                         'value' => ...
                       },
                '1' => {
                         'type' => ..,
                         'value' => ...
                       },
                '2' => {
                         'type' => ...,
                         'value' => ...
                       },
                '3' => {
                         'type' => ...,
                         'value' => ...
                       },
                '4' => {
                         'lvalue' => ...,
                         'rvalue' => ...,
                         'rvalue_type' => ...,
                         'type' => ...
                       }
              };
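
      Even a single element of that shape is not cheap. A quick check along these lines (placeholder values; the printed number varies by perl build) shows what one element costs in memory:

        use strict;
        use warnings;
        use Devel::Size qw(total_size);

        # One element shaped like the dump above, with placeholder values.
        my $el = {
            0 => { type => 'a', value => 'b' },
            1 => { type => 'a', value => 'b' },
            2 => { type => 'a', value => 'b' },
            3 => { type => 'a', value => 'b' },
            4 => { lvalue => 'l', rvalue => 'r', rvalue_type => 't', type => 'z' },
        };
        print total_size($el), "\n";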
      
      The aim of the script is to generate more (similar in some way) hashes. I store these hashes in an array. The main action is the generation, and it looks like:
          my $elements_ref = shift; # reference to AoH which at the beginning contains one element
          
        my %el = %{$elements_ref->[0]}; # Take first element of AoH
          my @hitlist; # Stores hash keys that indicate hash elements I will change
          
          # ...
          # Fill @hitlist with hash keys that I need. Simple for loop.
          # ...
          
          my $k = @hitlist; # Size of @hitlist. In this case it is equal to 3
          
          my $iter = Algorithm::Combinatorics::variations_with_repetition($new, $k); # Give me all variations of some elements
          
          while (my $var = $iter->next)       
          {
              my $new_el= Storable::dclone \%el;
              
              # Now, for each variation I will create (clone) original %el
              # and in $new_el I will substitute $k elements of hash indicated
              # by keys contained in @hitlist
              
              for (my $j = 0; $j < $k; $j++)
              {   
                my $hit = $hitlist[$j];
                  
                  # ...
                $new_el->... = $var->[$j];
                  # ...
              }
              
              push @$elements_ref, $new_el; # Store new element in AoH
          } 
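
      For reference, a self-contained version of that pattern (the data for %el, @hitlist, and the substitution list are made up, since the real ones aren't shown) that reproduces the growth:

        use strict;
        use warnings;
        use Storable ();
        use Algorithm::Combinatorics ();
        use Devel::Size qw(total_size);

        my %el      = map { $_ => { type => 't', value => 'v' } } 0 .. 4;
        my @hitlist = (0, 1, 2);            # keys whose values get substituted
        my @subs    = ('a' .. 'e');         # hypothetical substitution values
        my $elements_ref = [ \%el ];

        my $iter = Algorithm::Combinatorics::variations_with_repetition(\@subs, scalar @hitlist);

        while (my $var = $iter->next)
        {
            my $new_el = Storable::dclone(\%el);
            for my $j (0 .. $#hitlist)
            {
                $new_el->{ $hitlist[$j] }{value} = $var->[$j];
            }
            push @$elements_ref, $new_el;
        }

        # 5**3 = 125 variations, plus the original element
        printf "%d elements, %d bytes in memory\n",
            scalar @$elements_ref, total_size($elements_ref);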
      
      And the problem is that when, at the end, I store the AoH on disk it takes 8 MB, but while generating it takes 130 MB of RAM.
        • What's in $new?
        • How are you saving the AoH to disk?
        • Are you sure you are saving everything you are generating? (One way to check is sketched below.)
        • How are you measuring the size in memory?
        • If the process is completing, why are you concerned with how much memory it took?
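
        A quick round-trip check along these lines (toy data; the file name and the nstore/retrieve pair are just for illustration) would answer the saving and measuring questions:

        use strict;
        use warnings;
        use Storable qw(nstore retrieve);
        use Devel::Size qw(total_size);

        # Toy stand-in for the generated AoH.
        my $elements_ref = [ map { { type => "t$_", value => "v$_" } } 1 .. 1000 ];

        nstore($elements_ref, 'elements.dat');   # hypothetical dump file
        my $back = retrieve('elements.dat');

        printf "generated: %d elements, %d bytes in RAM\n",
            scalar @$elements_ref, total_size($elements_ref);
        printf "reloaded:  %d elements, file is %d bytes on disk\n",
            scalar @$back, -s 'elements.dat';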

Re: Issue with cloning and large structure processing
by zwon (Abbot) on Apr 10, 2010 at 10:03 UTC

    It looks like you're creating multiple copies of %some_el. No wonder it uses more memory.

      But I need these copies. Each copy is different. I'm copying an element of the AoH, changing some values in the hash, and putting the changed element back into the array.

        I have no doubt that you need these copies. But why do you expect that multiple copies would use the same amount of memory as a single structure? After each loop iteration your array contains one more element, so its size is bigger:

        use 5.010;
        use strict;
        use warnings;
        use Devel::Size qw(total_size);
        use Storable;

        my %hash = ( a => 1, b => 2, c => 3 );
        say "Size of hash is: ", total_size(\%hash);

        my $ref;
        for (1..100_000) {
            push @$ref, Storable::dclone(\%hash);
        }
        say "Size of array is: ", total_size($ref);

        __END__
        Size of hash is: 313
        Size of array is: 31448769
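
        That works out to about 314 bytes per clone (31448769 / 100_000), i.e. essentially the full 313-byte footprint of the original hash for every copy: dclone shares nothing, so N copies cost roughly N times the memory of one.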