usr345 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have a test task for job: having a nested data structure of arrays, hashes and scalars, create a function, that will delete the empty elements, i.e. elements, not containing any scalars.

I created such function, and it works. But I am asking: "is there a more elegant way of implementation?". The function recursively loops through the data structure, removes the empty elements and if there are no elements in the current node - returns 0; The algorithm for arrays and hashes is the same.

#!/usr/bin/perl -w use strict; use Data::Dumper; my @array = ( [1], [], {1 => 2, 2 =>{}} ); clean(\@array); print Dumper(\@array); sub clean { my $ref = shift; my $ref_name = ref($ref); if ($ref_name eq "SCALAR") { return 1; } elsif ($ref_name eq "ARRAY") { my $n = scalar(@$ref); for(my $i = 0; $i < $n; $i++) { my $result = clean($ref->[$i]); if ($result == 0) { delete($ref->[$i]); } } $n = scalar(@$ref); if ($n == 0) { return 0; } return 1; } elsif ($ref_name eq "HASH") { while(my ($key, $value) = each( %$ref)) { my $result = clean($value); if ($result == 0) { delete($ref->{$key}); } } my $n = scalar(keys %$ref); if ($n == 0) { return 0; } return 1; } elsif ($ref_name eq "REF") { die "incorrect reference submitted to function clean\n"; } else { clean(\$ref_name); } }
as the result the script will print:
$VAR1 = [ [ 1 ], undef, { '1' => 2 } ];

Replies are listed 'Best First'.
Re: Perl task function
by BrowserUk (Patriarch) on Sep 08, 2010 at 21:25 UTC

    One problem is that using delete on arrays leave behind an undef 'placeholder' which mean your scalar @$ref test fails, hence the undef in your result. The fix is to use splice for arrays.

    I've always like dispatch tables for this kind of mutual recursion:

    #!/usr/bin/perl -w use strict; use Data::Dumper; my %dispatch = ( SCALAR => sub{ return 1; }, ARRAY => sub { my $ref = shift; clean( $ref->[ $_ ] ) or splice @$ref, $_, 1 for reverse 0 .. $#$ref; return 0 unless @$ref; }, HASH => sub { my $ref = shift; my( $key, $val); clean( $val ) or delete $ref->{ $key } while ( $key, $val ) = each %$ref; return 0 unless scalar keys %$ref; }, REF => sub{ my $ref = shift; die "REF: $ref not allowed"; }, CODE => sub{ my $ref = shift; die "CODE: $ref not allowed"; }, '' => sub{ return 1; }, ); sub clean { return $dispatch{ ref $_[0] }->( $_[0] ); } my @array = ( [1], [], { 1 => 2, 2 =>{} } ); clean(\@array); print Dumper(\@array); __END__ $VAR1 = [ [ 1 ], { '1' => 2 } ];

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Perl task function
by ikegami (Patriarch) on Sep 08, 2010 at 22:49 UTC

    I like the following. It can even clear the variable you pass as the arg to clean().

    sub clean { our $node; local *node = \$_[0]; # Alias return if !defined($node); my $reftype = ref($node); return if !$reftype; if ($reftype eq 'REF' || $reftype eq 'SCALAR') { clean($$node); $node = undef if !defined($$node); } elsif ($reftype eq 'ARRAY') { clean($_) for @$node; @$node = grep defined, @$node; $node = undef if !@$node; } elsif ($reftype eq 'HASH') { clean($_) for values(%$node); delete @{$node}{ grep !defined($node->{$_}), keys(%$node) }; $node = undef if !keys(%$node); } }

    For each reference type, it cleans each item in the referenced container, cleans the referenced variable, then cleans the reference itself. Each of these steps involves syntax that's specific to the type of reference, so they can't be combined.

    Tester:

    use strict; use warnings; use Data::Dumper qw( Dumper ); sub dumper { local $Data::Dumper::Useqq = 1; local $Data::Dumper::Terse = 1; local $Data::Dumper::Indent = 0; return Dumper($_[0]); } for ( [ [1], [], {1 => 2, 2 =>{} } ], [ undef, [ undef ], { x => undef }, \undef, \\\undef ], ) { my $d = $_; print(dumper($d), "\n"); clean($d); print(dumper($d), "\n"); print("\n"); }
    [[1],[],{1 => 2,2 => {}}] [[1],{1 => 2}] [undef,[undef],{"x" => undef},\undef,\\$VAR1->[3]] undef

    Update: Replaced a "!" that wandered off.

      Hello, BrowserUk. Your solution is great, but there is one error: when I splice the array it's size decreases, so the cycle doesn't check the next element. Here is the fixed solution. I changed your syntax a little - you prefer a compact style, while I try to make the code more comprehensible - even placing some extra code.
      my %dispatch = ( SCALAR => sub{ return 1; }, ARRAY => sub { my $ref = shift; my $n = scalar(@$ref); for(my $i = 0; $i < $n; ++$i) { my $result = clean($ref->[ $i ]); if($result == 0) { splice(@$ref, $i, 1); $i--; $n--; } } unless(@$ref) { return 0; } }, "HASH" => sub { my $ref = shift; while(my ($key, $value) = each( %$ref)) { my $result = clean($value); if ($result == 0) { delete($ref->{$key}); } } my $n = scalar(keys %$ref); if ($n == 0) { return 0; } return 1; }, REF => sub{ my $ref = shift; die "REF: $ref not allowed"; }, CODE => sub{ my $ref = shift; die "CODE: $ref not allowed"; }, '' => sub{ return 1; }, );

      ikegami, I don't understand your solution. There is no need to undef the scalars, and it didn't delete the empty arrays. Thanks for help, though.

        ikegami, I don't understand your solution.

        Which part? I wrote it to be extremely straightforward. Each line is completely independent of the others.

        There is no need to undef the scalars,

        I think you mean there is no need to get rid of references to undef. If so, then just get rid of the following:

        if ($reftype eq 'REF' || $reftype eq 'SCALAR') { clean($$node); $node = undef if !defined($$node); } els

        (Like I said, very modular.)

        it didn't delete the empty arrays

        It both empties arrays (removes any undefs from them), and deletes references to empty arrays. I even showed that it does. Here's a couple more tests:

        [[[],[],["b"]],"a"] [[["b"]],"a"] [[[],[],[]],"a"] ["a"]

        Could you give a case that breaks?

        my $n = scalar(keys %$ref); if ($n == 0) { return 0; } return 1;

        I find it weird that clean() returns false when the node is clean (empty). Either way, your code could be simplified to

        return keys(%$ref) ? 1 : 0;

        or even just

        return keys(%$ref);
        there is one error: when I splice the array it's size decreases, so the cycle doesn't check the next element.

        There is no error in my implementation. It does all elements, if you start from the end and work backward. Hence why my loop uses reverse:

        for reverse 0 .. $#$ref;

        Your forward running C-style loop, using splice and requiring that you decrement $i & $n after deletions is rather clumsy and unperlish by comparison. But if that's what you're comfortable with, stick with it.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.