in reply to Massive Perl Memory Leak

Don't use local %datahash, use my %datahash.

Unless you really really understand what the difference is between these two keywords, 99.99% of the time, you want my.

The local keyword temporarily saves the current value of a global variable so you can change it, and then automatically restores that saved value when the enclosing scope is left.
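For example, here is a minimal sketch of the difference (the variable and sub names are made up purely for illustration):

    our $count = 1;

    sub show { print "count = $count\n" }    # reads the package variable

    sub with_local {
        local $count = 99;    # saves the old value, restores it on scope exit
        show();               # prints 99: the package variable itself changed
    }

    sub with_my {
        my $count = 99;       # a brand-new lexical, invisible to other subs
        show();               # prints 1: the package variable is untouched
    }

    with_local();    # count = 99
    with_my();       # count = 1
    show();          # count = 1 (local's change has been undone)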

I see that you're using the structure in both routines, but it seems a bit odd to work with the global %datahash from both of them (and others). I think you might want to check out references, so you can pass data structures around explicitly instead of having multiple routines try to protect their values and side effects on these global variables.
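Here's a minimal sketch of passing a reference instead of sharing a global (the sub names and sample data are made up for illustration):

    sub collect_devices {
        my ($datahash) = @_;                 # a hash reference passed in
        $datahash->{devinfo}{eth0} = 'up';   # autovivifies the nested hash
    }

    sub report {
        my ($datahash) = @_;
        for my $dev (keys %{ $datahash->{devinfo} }) {
            print "$dev: $datahash->{devinfo}{$dev}\n";
        }
    }

    while (1) {
        my %datahash;                  # fresh and lexically scoped each pass,
        collect_devices(\%datahash);   # so it is freed when the pass ends
        report(\%datahash);
        last;                          # just so the sketch terminates
    }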

--
[ e d @ h a l l e y . c c ]

Re^2: Massive Perl Memory Leak
by wagnerc (Sexton) on Jun 11, 2007 at 17:43 UTC
    Thanks for getting back to me. I'm using the local %datahash construct so I can avoid gotchas with passing references (but it seems I've run into another gotcha somewhere else). %datahash is essentially just a thread global, and the local is there to automatically garbage collect it as the forever loop restarts. I can't use my because the variable wouldn't be visible to the descendant subroutines. Now correct me if I'm wrong, but isn't %datahash garbage collected the moment the forever loop loops? It's not creating *new* %datahashes and keeping the *old* ones stashed away, right? I'm manually walking %datahash and deleting everything out of it at the end of the loop, so either way it shouldn't be building up memory.
      $foo = 5;

      sub do_something { print $foo, $/; }

      sub main {
          do_something();
          {
              local $foo = 6;
              do_something();
          }
          do_something();
      }

      main();
      The above code is a typical use of local. Internally, it is conceptually equivalent to the following code (shown for the innermost block only). You'd get 5, 6, 5 printed.
      {
          my $unnamed_backup_of_old_value_of_foo = $foo;
          $foo = 6;
          do_something();
          $foo = $unnamed_backup_of_old_value_of_foo;
      }

      The actual storage for $foo is created by the first statement ($foo = 5;). That single storage slot is what gets changed whenever you see assignments being done; local only arranges for the old contents to be put back later.

      This of course can get very big and hairy with a huge hash of data to "backup" and "restore" in the equivalent %unnamed_backup_of_old_value_of_datahash. Also, deleting items from the local hash would not have much bearing, since the whole hash will just get tossed out and restored from the backup.
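      A small sketch of that save-and-restore behaviour (the names are just for illustration):

      our %datahash = (a => 1, b => 2);

      sub fiddle {
          local %datahash;         # the whole old hash is stashed away here
          %datahash = (c => 3);
          delete $datahash{c};     # only touches the temporary hash
      }                            # the original is put back on scope exit

      fiddle();
      print scalar(keys %datahash), "\n";    # 2 -- the original pairs are back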

      If you're doing local from multiple threads, well, you can see where your memory is going. Secondly, I can quite easily imagine a race condition where your main shared global hash is getting "restored" in the wrong order, leading to pandemonium.

      --
      [ e d @ h a l l e y . c c ]

        %datahash is initialized as empty since it's not used until we're already inside the forever loop. So when local reinstantiates it at each pass, it "restores" its value to the empty list, (). In my usage, I could replace it with %datahash = (); and the functionality would be identical. %datahash is also an unshared variable; no other threads can access it. So as long as Perl's innards are clean, no other thread can tamper with %datahash. I'm using the term thread global to mean it's global only within that one thread. To get program globals you have to share() them.

        Does anybody know of a way to "see" data/memory that doesn't have a varname pointing to it? To find out what "anonymous storage" is in use.

      I can't answer the garbage collection issue except to say what you probably already know: Perl frees a value's memory once nothing references it any longer.
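      A tiny sketch of that reference counting (the package name is made up for illustration):

      package Tracker;
      sub new     { return bless {}, shift }
      sub DESTROY { print "freed\n" }    # runs when the last reference is gone

      package main;
      my $obj  = Tracker->new;
      my $copy = $obj;    # two references to the same blessed hash
      undef $obj;         # nothing printed: $copy still holds it
      undef $copy;        # "freed" printed here: the refcount hit zero
      print "done\n";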

      However, using global variables to "pass" information between subs is bad, bad, bad! Instead, pass a reference to the descendant subs so it is clear where information is being used and possibly modified. If you find you are passing a lot of parameters, consider using lightweight object-oriented techniques by simply blessing a hash into the current package and then calling the descendant subs on the object:

      my $obj = bless {datahash => \%datahash, ...};
      $obj->sub1 ();
      $obj->sub2 ();
      ...

      sub sub1 {
          my $self = shift;
          $self->sub3 (wibble => 3);
      }
      ...

      sub sub3 {
          my ($self, %params) = @_;
          for my $key (keys %{$self->{datahash}}) {
              my $value = $self->{datahash}{$key} * $params{wibble};
              ...
          }
      }

      and if you have trouble getting references going, ask! Writing nasty code to avoid learning a new and valuable technique will not reward you over the long run, and by the sound of it not even in the short run. You may find References quick reference helps.


      DWIM is Perl's answer to Gödel
Re^2: Massive Perl Memory Leak
by wagnerc (Sexton) on Jun 11, 2007 at 19:45 UTC
    Do any of you see any problem with my hash usage? Specifically:

     %{$datahash{"devinfo"}} = %devinfo;

    I'm curious as to how the deep nature of %devinfo is copied over to the %datahash branch. As in, do any refs to the original var survive? Or is that a 100% clean copy with no strings attached?

    Would either of these be equivalent or better syntax? :

    $datahash{"devinfo"} = \%devinfo;
    $datahash{"devinfo"} = %devinfo;    # this must be wrong

      Consider:

      use strict;
      use warnings;
      use Data::Dump::Streamer;

      my %devinfo  = (1 => {a => 'apple', b => 'orange'});
      my %datahash = (devinfo => {});

      %{$datahash{"devinfo"}} = %devinfo;

      Dump (\%devinfo, \%datahash, $datahash{devinfo});

      Prints:

      $HASH1 = { 1 => { a => 'apple', b => 'orange' } };
      $HASH2 = { devinfo => 'A: $HASH3' };
      $HASH3 = { 1 => $HASH1->{1} };
      alias_hv(%$HASH2, 'devinfo', $HASH3);

      The copy is a shallow copy. If any of the values of %devinfo are references then the copy simply duplicates the reference. You may find Storable's dclone helps if you are looking for a clone of the data.
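      For example, a minimal, self-contained sketch of dclone (reusing the sample data from the snippet above):

      use strict;
      use warnings;
      use Storable qw(dclone);

      my %devinfo  = (1 => {a => 'apple', b => 'orange'});
      my %datahash = (devinfo => {});

      # dclone makes a deep copy: nested structures are duplicated, not shared.
      $datahash{devinfo} = dclone(\%devinfo);

      $devinfo{1}{a} = 'pear';                  # change the original...
      print $datahash{devinfo}{1}{a}, "\n";     # ...the copy still says 'apple'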


      DWIM is Perl's answer to Gödel