wertert has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

I have a data structures made up of HOHOH etc which represents a tree. I am trying to find a way of removing an entire 'layer' of the tree while keeping the other layers intact and still linked to each other. As an example I may start with

$W={ 'A' => { 'L' => { 'D'=>99, 'X'=>99, }, 'C' =>{ 'E'=>99 }, }, 'X' => { 'Q'=>{ 'O'=>99, }, 'T'=>{ 'U'=>99, }, }, };
and I want to remove all of 'level' 2 ( ie L,C,Q,T ) leaving me with;
$X={ 'A' => { 'D'=>99, 'X'=>99, 'E'=>99 }, 'X' => { 'O'=>99, 'U'=>99, }, };
Doesn't sound too bad ? The only complication is the tree can have any number of layers so I've been leaning towards various recursive ideas and using dclone but I'm not getting anywhere fast. I'm also thinking that it's probably easier to create a new hash instead of changing the source, unlike my lame effort below.
printTree($W); # just dumps HOHOH - could use data::dumper - remove +d remove1($W,2); # ie remove layer 2 ( counting from 1 ) print "--------------------\n"; printTree($W); sub remove1 { my($refToHash)=shift; my($stage)=shift; foreach $a (keys %{$refToHash}) { if($depth==$stage-1) # then we've hit the level before the lev +el to be removed/ { print "removing $a\n"; foreach $b (keys %{$refToHash->{$a}}) { $refToHash->{$b} = $refToHash->{$a}->{$b}; delete($refToHash->{$a}); } } else { $depth++; remove1($refToHash->{$a},$stage); $depth--; } } }
output is;
perl delete_mid2.pl A C E L D X X Q O T U removing C removing L removing Q removing T -------------------- A D E X X O U
so sort of ok but I'm really not happy with it. Removing layer 1 is also ok but layer 3 does not work. Can anyone suggest a better way of doing this. I've been staring at this (from Randal L. Schwartz, Perl hacker) for the last 3 hours and trying to think of a way of making it layer sensitive ...
sub deep_copy { my $this = shift; if (not ref $this) { return $this; } elsif(ref $this eq "HASH") { return {map { $_ => deep_copy($this->{$_}) } keys %$this}; } }
I would really appreciate ANY suggestions. Thanks in advance. Jeff

Replies are listed 'Best First'.
Re: recursive hash layer removal question
by Roy Johnson (Monsignor) on May 10, 2005 at 17:55 UTC
    This does an in-place edit of the underlying hash. With your $W above, you could flatten level 0 or 1. Flattening level 2 leaves it unchanged. Having other reference types intermingled will induce barfing.
    sub do_for_level { my ($href, $level, $code) = @_; if ($level == 0) { %$href = map $code->($_, $href->{$_}), keys %$href if ref $href; } else { map do_for_level($_, $level-1, $code), values %$href; } } sub flatten_level { my ($key, $href) = @_; ref $href ? map {$_ => $href->{$_}} keys %$href : ($key, $href); } use Data::Dumper; do_for_level($W, 2, \&flatten_level); print Dumper($W);

    Caution: Contents may have been coded under pressure.
Re: recursive hash layer removal question
by Animator (Hermit) on May 10, 2005 at 17:27 UTC

    My inital thoughts: (might update it later)

    Change your foreach loop into this:

    if (ref $refToHash->{$a} eq 'HASH') { foreach $b (keys %{$refToHash->{$a}}) { $refToHash->{$b} = $refToHash->{$a}->{$b}; } } delete($refToHash->{$a});

    The problems with your current code:

    • You are dereferencing a hash element without checking if the element is a hash referene (use the ref-function)
    • You are deleting the original element ($refToHash->{$a}) inside the foreach loop, this is not a good idea...

    Note, if you remove the last layer with this code then the previous layer will point to empty hash reference... is that what you want? (add an else statement to the 'if (ref ..)' if it isn't

      Thanks for the fast reply. I'm sorry I normally check for hash reference. Can't understand why I didn't here i must have recursion brain ache. Example is the printTree function removed from the original post.
      sub printTree { my($a)=shift; my(@l); foreach $b (sort keys %{$a}) { if(ref($a->{$b}) eq 'HASH') { $l[$depth]=$b; print "@l\n"; $depth++; printTree($a->{$b}); $depth--; } else { $l[$depth]=$b; print "@l\n"; } } }
Re: recursive hash layer removal question
by Forsaken (Friar) on May 10, 2005 at 21:27 UTC
    Just out of curiosity, since I have no idea what the real data looks like, isn't there the possibility of running into a situation where you end up with duplicate keys, hence losing data? ie, if you were to have a tree like this:
    my %tree = ( 'x' => { 'a' => 99, 'b' => 99, 'c' => 99, } 'y' => { 'c' => 99, 'd' => 99, 'e' => 99, } )
    you'd end up losing the duplicate 'c' value. Is this a situation that might occur and needs to be anticipated somewhat?(gotta love making things more complicated :P)

    Remember rule one...

      Hi Forsaken real data looks like this
      $VAR1 = { 'EXCLUSTER01' => { 'IBFFTDSD72N01A' => { 'D280_3_TH' => { 'A' => { 'TH_EXCLUSTER01_03' => 999, 'TH_EXCLUSTER01_02' => 999, 'TH_EXCLUSTER01_01' => 999 } } } }, blar blar
      It represents the clusters and hosts within a storage array network ie group->hosts->disk arrays->disk controller->volumes. I'm writing a sysadm tool which allows the user to 'drill' into this recursive information. I don't think repetition within the tree is a problem
        Ok - new idea. Am I re-inventing the wheel here ? I really want to spend my time focusing on the tool i'm writing not developing and load of recursive functions for tree manipulation. Can anyone suggest a CPAN module that I should consider using ?
Re: recursive hash layer removal question
by TedPride (Priest) on May 10, 2005 at 19:42 UTC
    Oops - was decrementing twice. Thanks for the quick notice, Roy Johnson.
    use strict; use warnings; use Data::Dumper; my $W={ 'A' => { 'L' => { 'D'=>99, 'X'=>99, }, 'C' =>{ 'E'=>99 }, }, 'X' => { 'Q'=>{ 'O'=>99, }, 'T'=>{ 'U'=>99, }, }, }; remove($W, 2); print Dumper($W); sub remove { my ($p, $d) = @_; if ($d-- > 1) { remove($_, ($d)) for values %$p; # Had second decrement here } else { my %newhash; for (values %$p) { @newhash{keys %$_} = values %$_; } $_[0] = \%newhash; } }
      Thanks for all your help. Lots of good stuff here. At the risk of appearing useless I may need some help with code provided by TedPride ( very cool BTW ).
      sub remove { my ($p, $d) = @_; if ($d-- > 1) { remove($_, ($d)) for values %$p; # Had second decrement here } else { my %newhash; for (values %$p) { @newhash{keys %$_} = values %$_; } $_[0] = \%newhash; } }
      I've not seen one of these before. It's an hash but it looks like an array ie '@'
      for (values %$p) { @newhash{keys %$_} = values %$_; } $_[0] = \%newhash;
      What is going on here then ? Checked out my cookbook etc but can't find a reference to this syntax. One other slight complication is the data can actually look like this;
      $W={ 'A' => { 'L' => { 'D'=>99, 'X'=>99, }, 'C' =>{ 'E'=>99 }, }, 'X' => { 'Q'=>{ 'O'=>99, }, 'T'=>{ 'U'=>99, }, }, 'W' => 99, };
      Now when we collapse level 2 ie LCQT I don't want to change the value for 'W'. So only drop values on the target level if they are HASH. So really it's a mix of H,HOH and HOHOH etc. Thanks again for your help. Jeff
        It's a hash slice assignment (in other words, it's not a hash, it is an array). Basically you're saying, here is a list of keys and for each of these keys assign these values (since 'keys' and 'values' return lists). I can't find any particular documentation on it, but it does work :)