cosmicperl has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,
  I've hit a problem that has me a bit stumped, and I'm not entirely sure which way to go towards a solution. I've got a subroutine that calls itself, and in doing so the initial my of my variables seems to lose its effect.
  I'm aiming to scan through a hash reference and delete the keys whose values are empty hash or array references.
My test code:-
my $hashvar = {
    emp  => {},
    hemp => {
        splay => {},
        hay   => {
            go => 1,
            ho => {},
        },
        tay => [],
        may => [ 'way' ],
    },
};

sub HashClean {
    my ($hashref) = @_;
    if (ref ($hashref) eq "HASH") {
        foreach my $key (keys %$hashref) {
            print "W $key\n";
            if (ref ($hashref->{$key}) eq "HASH") {
                print "HR $key\n";
                if (keys (%{$hashvar->{$key}})) {
                    print "HC $key\n";
                    &HashClean($hashref->{$key});
                }#if
                else {
                    print "HD $key\n";
                    delete $hashref->{$key};
                }#else
            }#if
            if (ref $hashref->{$key} eq "ARRAY") {
                print "AR $key\n";
                if ($#{$hashvar->{$key}}) {
                    print "AC $key $#{$hashvar->{$key}}\n";
                    &HashClean($hashref->{$key});
                }#if
                else {
                    print "AD $key\n";
                    delete $hashref->{$key};
                }#else
            }#if
        }#foreach
    }#if
    else {
        foreach my $key (@$hashref) {
            if (ref ($hashref->[$key]) eq "HASH") {
                if (keys (%{$hashvar->[$key]})) {
                    &HashClean($hashref->[$key]);
                }#if
                else {
                    # delete $hashref->[$key];
                }#else
            }#if
            if (ref $hashref->[$key] eq "ARRAY") {
                if ($#{$hashvar->[$key]}) {
                    &HashClean($hashref->[$key]);
                }#if
                else {
                    # delete $hashref->[$key];
                }#else
            }#if
        }#foreach
    }#else
}#sub

print Data::Dumper->Dump([$hashvar],[qw (::hashvar)]);
&HashClean($hashvar);
print Data::Dumper->Dump([$hashvar],[qw (::hashvar)]);

I know all the prints are a bit messy, but I've put them in to try and figure out what is happening. Output is:-

$::hashvar = {
  'hemp' => {
    'may' => [
      'way'
    ],
    'tay' => [],
    'hay' => {
      'ho' => {},
      'go' => 1
    },
    'splay' => {}
  },
  'emp' => {}
};
W hemp
HR hemp
HC hemp
W may
AR may
AC may -1
W tay
AR tay
AC tay -1
W hay
HR hay
HD hay
W splay
HR splay
HD splay
W emp
HR emp
HD emp
$::hashvar = {
  'may' => [],
  'hemp' => {
    'may' => [
      'way'
    ],
    'tay' => []
  },
  'hay' => {},
  'tay' => [],
  'splay' => {}
};

So it appears that when it re-calls HashClean a new $hashvar isn't created and instead it merges the keys with the old one. Not what I was expecting... and I'm afraid I don't know how to get it to do what I was expecting :S

Help as always, much appreciated

Lyle

Update: DOH! Seems that within the sub I'd accidentally typed hashvar where I meant hashref. Changing it to hashref fixed things and it works as I originally expected. Thanks guys
Update 2: I found that when an array or hash contained only empty arrays or hashes, you were left with an empty hash or array (as it wasn't empty before its empty hash/array contents were removed). Not fully understanding GrandFather's code, I ended up writing a mod of my original code so that it checks whether anything was deleted and reruns each level if something was:-
Update 3: I found that if you had empty structures within empty structures, within empty ones again, some of the initial ones were being left behind. This update returns whether it deleted anything to the previous level, which re-runs if needed.
sub HashClean {
    my ($hashref) = @_;
    my $deletes = 0;
    if (ref ($hashref) eq "HASH") {
        foreach my $key (keys %$hashref) {
            if (ref ($hashref->{$key}) eq "HASH") {
                if (keys (%{$hashref->{$key}})) {
                    $deletes += &HashClean($hashref->{$key});
                }#if
                else {
                    delete $hashref->{$key};
                    $deletes++;
                }#else
            }#if
            if (ref $hashref->{$key} eq "ARRAY") {
                if ($#{$hashref->{$key}} >= 0) {
                    $deletes += &HashClean($hashref->{$key});
                }#if
                else {
                    delete $hashref->{$key};
                    $deletes++;
                }#else
            }#if
        }#foreach
    }#if
    else {
        my $arraynum = 0;
        for (my $arraynum = 0; $arraynum <= $#$hashref; $arraynum++) {
            if (ref ($hashref->[$arraynum]) eq "HASH") {
                if (keys (%{$hashref->[$arraynum]})) {
                    $deletes += &HashClean($hashref->[$arraynum]);
                }#if
                else {
                    delete $hashref->[$arraynum];
                    $deletes++;
                }#else
            }#if
            if (ref $hashref->[$arraynum] eq "ARRAY") {
                if ($#{$hashref->[$arraynum]} >= 0) {
                    $deletes += &HashClean($hashref->[$arraynum]);
                }#if
                else {
                    delete $hashref->[$arraynum];
                    $deletes++;
                }#else
            }#if
        }#for
    }#else
    &HashClean($hashref) if $deletes;
    return $deletes;
}#sub
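
The re-run described in the updates is needed because a container is tested for emptiness before its children have been cleaned. A possible alternative, sketched here under assumptions, is to clean the children first and test afterwards, so one pass is enough. The name prune_empty and the sample key deep are hypothetical, and the sketch removes empty array elements with grep (which shifts later indices) rather than with delete, which may or may not be what is wanted.

```perl
use strict;
use warnings;

# Hypothetical helper: prunes empty hashes/arrays below $node, children first,
# and returns true when $node itself ends up as an empty container.
sub prune_empty {
    my ($node) = @_;
    if (ref $node eq 'HASH') {
        for my $key (keys %$node) {
            delete $node->{$key} if prune_empty($node->{$key});
        }
        return !%$node;
    }
    if (ref $node eq 'ARRAY') {
        # Assumption: dropping elements (and shifting indices) is acceptable.
        @$node = grep { !prune_empty($_) } @$node;
        return !@$node;
    }
    return 0;    # plain scalars, including undef, are always kept
}

my $data = {
    emp  => {},
    hemp => { splay => {}, hay => { go => 1, ho => {} }, tay => [], may => ['way'] },
    deep => { a => { b => [ {}, [] ] } },    # empties nested inside empties
};
prune_empty($data);
# $data is now { hemp => { hay => { go => 1 }, may => ['way'] } }
```

Because each level only decides after its children have been handled, the nested-empties case from Update 3 should collapse in a single traversal.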

Re: Variable scoping when a sub calls itself??
by moritz (Cardinal) on Apr 15, 2008 at 22:06 UTC
    It might be very confusing that $hashvar is passed as a parameter to your sub and, at the same time, is used inside it under its original name, so that a part (or subset) of the data structure is referenced under a different name.

    Don't do that. Don't mix $hashvar with $hashref.

    Also, don't call subs with a leading '&'; it can have unexpected behaviour.
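
One example of that unexpected behaviour, as a minimal sketch: calling `&name;` without parentheses hands the caller's current @_ to the callee. The subs count_args and caller_sub are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it received.
sub count_args { return scalar @_ }

sub caller_sub {
    my $plain     = count_args();    # explicit empty list: no arguments
    my $ampersand = &count_args;     # no parens: reuses caller_sub's @_
    return ($plain, $ampersand);
}

my ($plain, $ampersand) = caller_sub('a', 'b', 'c');
print "plain: $plain, ampersand: $ampersand\n";    # plain: 0, ampersand: 3
```

Writing `&count_args()` with parentheses avoids the shared @_, but it still bypasses prototypes.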

    Update: Here is a small script that does what you want, by constructing a new copy of the data structure instead of deleting items:

    use strict;
    use warnings;
    use Data::Dumper;
    use Scalar::Util qw(reftype);

    my $hashvar = {
        emp  => {},
        hemp => {
            splay => {},
            hay   => {
                go => 1,
                ho => {},
            },
            tay => [],
            may => [ 'way' ],
        },
    };

    sub clean {
        my $ref = shift;
        return $ref unless ref $ref;
        if (reftype($ref) eq 'HASH'){
            my %result;
            for (keys %$ref){
                my $tmp = clean($ref->{$_});
                $result{$_} = $tmp if (defined $tmp);
            }
            if (keys %result){
                return \%result;
            } else {
                return;
            }
        } elsif (reftype($ref) eq 'ARRAY'){
            return @$ref ? $ref : undef;
        }
    }

    print Dumper($hashvar, clean($hashvar));

    __END__
    $VAR1 = {
      'hemp' => {
        'may' => [
          'way'
        ],
        'tay' => [],
        'hay' => {
          'ho' => {},
          'go' => 1
        },
        'splay' => {}
      },
      'emp' => {}
    };
    $VAR2 = {
      'hemp' => {
        'may' => $VAR1->{'hemp'}{'may'},
        'hay' => {
          'go' => 1
        }
      }
    };

    I'm too tired to tell if it really works, so check it yourself ;-)

      Thanks for the tips. Fixed it now. And don't worry, I always use () with & so as not to get strange behavior. I just like being able to quickly distinguish between my routines and perl's.

        And don't worry, I always use () with & so as not to get strange behavior.

        & also disables prototypes.

        use strict;
        use warnings;

        sub my_splice(\@$;$@) {
            my ($array, $start, $length, @insert) = @_;
            return splice(@$array, $start, $length, @insert);
        }

        {
            my @array = qw( a b c );
            splice(@array, 1, 1, qw( d e ));
            print(@array, "\n");  # adec
        }

        {
            my @array = qw( a b c );
            my_splice(@array, 1, 1, qw( d e ));
            print(@array, "\n");  # adec
        }

        {
            my @array = qw( a b c );
            &my_splice(@array, 1, 1, qw( d e ));
            print(@array, "\n");  # Can't use string ("a") as an ARRAY ref while "strict refs" in use at !.pl line 6.
        }

        Now, you could argue that prototypes should be avoided, but they are used by many modules.

Re: Variable scoping when a sub calls itself??
by GrandFather (Saint) on Apr 15, 2008 at 22:13 UTC

    But $hashvar is global. Why should HashClean recreate it? If you want a new instance of $hashvar every time you enter HashClean, then $hashvar needs to be local to HashClean.
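
A small sketch of the difference, using hypothetical names ($outer, walk, @seen): a variable declared with my inside the sub is created afresh for every call, including recursive ones, while a variable declared outside the sub (strictly speaking a file-scoped lexical here, which behaves like a global for this purpose) is the same one in every call.

```perl
use strict;
use warnings;

# Declared outside the sub: every call to walk() sees this same variable.
my $outer = { depth => 0, child => { depth => 1, child => { depth => 2 } } };

my @seen;
sub walk {
    my ($node) = @_;    # a fresh $node for each call, recursive or not
    push @seen, "node=$node->{depth} outer=$outer->{depth}";
    walk($node->{child}) if $node->{child};
    return;
}

walk($outer);
print "$_\n" for @seen;
# node=0 outer=0
# node=1 outer=0
# node=2 outer=0
```

The node value changes with each level of recursion, while the outer value always refers to the top of the structure, which is what happens when $hashvar is used inside HashClean in place of $hashref.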

    It seems that what you are trying to do is recursively traverse a mixed hash and array structure and clean it up. Maybe the following will help:

    use strict;
    use warnings;

    sub clean {
        my $ref = shift;

        if ('HASH' eq ref $ref) {
            clean ($ref->{$_}), $ref->{$_} = undef for keys %$ref;
        } elsif ('ARRAY' eq ref $ref) {
            clean ($_), $_ = undef for @$ref;
        }
    }

    my $hashvar = {
        emp => {},
        hemp => {splay => {}, hay => {go => 1, ho => {},}, tay => [], may => ['way'],},
    };

    clean ($hashvar);
    $hashvar = undef;

    Update: rereading the OP I see that deleting the whole structure is not quite what you want, but the code above should still be a good starting point for getting to where you want to go. In fact the following may just do the trick:

    use strict;
    use warnings;
    use Data::Dump::Streamer;

    sub clean {
        my $ref = $_[0];

        if ('HASH' eq ref $ref) {
            clean ($ref->{$_}) and delete $ref->{$_} for keys %$ref;
            return ! keys %$ref;
        } elsif ('ARRAY' eq ref $ref) {
            clean ($ref->[$_]) and delete $ref->[$_] for reverse 0 .. $#$ref;
            return ! @$ref;
        } else {
            return ! defined $ref;
        }
    }

    my $hashvar = {
        emp => {},
        hemp => {splay => {}, hay => {go => 1, ho => {},}, tay => [], may => ['way'],},
    };

    clean ($hashvar);
    Dump ($hashvar);

    prints:

    $HASH1 = {
               hemp => {
                         hay => { go => 1 },
                         may => [ 'way' ]
                       }
             };

    Tip for the future - show what you expect as well as what you get.


    Perl is environmentally friendly - it saves trees
      Don't you think it cleans too much? ;-)

      cosmicperl wants to delete pairs from hashes where the values are empty hash or array refs. After your sub runs, $hashvar looks like this:

      $hashvar = { 'hemp' => undef, 'emp' => undef };

      No need for recursion to achieve that, perl's garbage collector does the rest when you just delete the hash values ;-)
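
A minimal sketch of that point, assuming nothing else holds a reference into the structure: a weakened reference ($probe, a hypothetical name) is used only to observe whether the nested hash is still alive after its parent key has been deleted.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $data = { hemp => { hay => { go => 1 } }, emp => {} };

# A weak reference does not keep its target alive; it becomes undef
# once the target has been freed.
my $probe = $data->{hemp}{hay};
weaken($probe);

delete $data->{hemp};    # drops the last strong reference to the whole subtree

print defined $probe ? "still alive\n" : "freed\n";    # freed
```

With circular references the counts never reach zero, which is the case where walking the structure and breaking links by hand can still be necessary.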

        I noticed that that was not what OP was after on re-reading the node (cf my update). My initial assumption was that OP needed to forcibly delete each node in the structure to get around leaks due to circular references or some such.


      Thanks for your update. Looks a lot neater than my code (which did work when I sorted out the variable names). And yes I did want all the info, just not the empty hashes and arrays. I'll make sure to post what output I want in the future.

      Lyle