Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, before you ask, yes, this is an exercise, and no, it's not for school. In fact what it's for barely matters any more as I've lost the opportunity I'm sure, but given that I don't like to let things lie unsolved, I could use a hand. Basically I've got a nth level deep array of keys to values which are arrays that I need to "flatten". It's easier to explain by looking at the data structure so here you go:
my $data_struct = [ 'D', [ 'C', [ 'D', [ 'C', [ 'D', 'E' ], 'E', [ 'B' ] ], 'E', [ 'B', [ 'C' ] ] ], 'E', [ 'B', [ 'C', [ 'D', 'E' ] ] ] ], 'E', [ 'B', [ 'C', [ 'D', [ 'C', 'E' ], 'E', [ 'B' ] ] ] ], 'B', [ 'C', [ 'D', [ 'C', [ 'D', 'E' ], 'E', [ 'B' ] ], 'E', [ 'B', [ 'C' ] ] ] ] ];
What I need to do is turn it into something like:
my $flattened = [ [D, C, D, C, D], [D, C, D, C, E], [D, C, D, E, B], [D, C, E, B, C], etc.. ]
I've banged my head on this for a while, any input/hints would be greatly appreciated.

Replies are listed 'Best First'.
Re: A maybe(?) interesting comp-sci type problem
by moritz (Cardinal) on May 06, 2008 at 07:55 UTC
    If I understand it correctly, you don't want to flatten the tree, but build possible combinations from it, so
    [A, [B, C]]
    becomes
    [A, B], [A, C]
    In "Higher Order Perl" are some algorithms that do similar things with iterators.

    If you don't have a copy of that book available (which would be a shame, but happens nonetheless ;-)) you should first write a sub that purges any subtrees deeper than 5 (in your example), and then a recursive one that does the expansion.

Re: A maybe(?) interesting comp-sci type problem
by ikegami (Patriarch) on May 06, 2008 at 08:36 UTC
    The structure can be defined as
    Tree : Node* Node : Term Tree | Term

    It's not the ideal structure since a subtree needs to be distinguishable from a term and we need to look ahead, but it makes the tree more compact.

    sub flatten { my ($tree) = @_; my @results; for (my $i=0; $i<@$tree; ) { my $has_subtree = ref($tree->[$i+1]); if ($has_subtree) { my $term = $tree->[$i+0]; my $subtree = $tree->[$i+1]; push @results, map { [ $term, @$_ ] } @{ flatten($subtree) }; $i += 2; } else { my $term = $tree->[$i+0]; push @results, [ $term ]; $i += 1; } } return \@results; } print("@$_\n") for @{ flatten($data_struct) };

    Update: Small code reformat to resemble the tree structure description.

      Hmm, this returns 13 x 5 = 65 elements from 35 original values

        What are you implying? I don't think anything can be derived from 35 since it's not a full tree.

        I count 13 leaves, so there should be 13 results.
        And every leaf is at depth 5, so each result should be of length 5.

Re: A maybe(?) interesting comp-sci type problem
by grizzley (Chaplain) on May 06, 2008 at 09:54 UTC
    Couldn't stop thinking about this approach... Please, do not consider it as real world solution :)
    use Data::Dumper; $_ = Dumper $data_struct; s/[\s',;]//g; s/.*=//; s/(\w)(?=\w)/$1,/g; print; # convert A[B,C] -> [AB,AC] while(s!(\w+)\[([\w,]+)\]!'['.join(',',map$1.$_, split/,/,$2).']'!e # convert [[[AB,AC]]] -> [AB,AC] || s!\[(\[[\w,]+\])\]!$1! # convert [AB,AC][DE] -> [AB,AC,DE] || s!\[([\w,]+)\]\[([\w,]+)\]![$1,$2]!) { # see what happens after every step print }
Re: A maybe(?) interesting comp-sci type problem
by shmem (Chancellor) on May 06, 2008 at 09:18 UTC
    sub flatten { my @ret = (); while (@_) { my $k = shift; if (ref $_[0]) { my $v = shift; push @ret, [ $k, @$_ ] for flatten(@{$v}); } else { push @ret, [ $k ]; # push @ret, [ shift ] while @_ and (! $_[1] or ! ref $_[1] +); } } @ret; } use Data::Dumper; print Dumper([flatten(@{$data_struct})]);

    update: flattened code (see below :)

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

      Two quick notes:

      Your inner while is redundant with your outer while. You can get rid of
      push @ret, [ shift ] while @_ and (! $_[1] or ! ref $_[1]);

      It wouldn't have hurt to note that your function destroys the original data structure.

      Update: Me bad.

        It wouldn't have hurt to note that your function destroys the original data structure.

        No, it doesn't. It doesn't operate directly on $_[n], but makes copies. The original is untouched.

        --shmem

        _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                      /\_¯/(q    /
        ----------------------------  \__(m.====·.(_("always off the crowd"))."·
        ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re: A maybe(?) interesting comp-sci type problem
by jdporter (Paladin) on May 06, 2008 at 14:09 UTC

    A pure functional recursive approach:

    sub flatten { map { my $k = $_[0][$_*2]; my $v = $_[0][$_*2+1]; !defined $v ? [$k] : # one scalar leaf !ref $v ? ([$k],[$v]) : # two scalar leafs map [$k,@$_], flatten($v); # scalar key and arrayref val } (0..$#{$_[0]}/2) # iterate over input list pairwise } my @a = flatten( $data_struct ); print "@$_\n" for @a;
    A word spoken in Mind will reach its own level, in the objective world, by its own weight
Re: A maybe(?) interesting comp-sci type problem
by rovf (Priest) on May 06, 2008 at 08:03 UTC
    Here something which would flatten your array completely; putting it into a "matrix" with 5 columns is left as an exercise.

    Warning: Code not tested.

    # Call as: flatten([...]) sub flatten { my $aref=shift; die "not an array ref" unless ref($aref) eq 'ARRAY'; @result=(); foreach my $element (@$aref) { push @result, (ref($element) eq 'ARRAY' ? flatten($element) : $element); } @result; }
    -- 
    Ronald Fischer <ynnor@mm.st>

      Besides the fact that this does not work (it will if you localise @result with my)..... As they say, any recursive solution can be written as in iterative one. Taking advantage of the fact you can push onto a list you are iterating over:

      sub flatten_recursive { my $tree = shift; my @result; for my $node (@$tree) { push @result, ref $node ? flatten_recursive($node) : $node; } @result; } print join ' ', flatten_recursive($data_struct), "\n"; sub flatten_iterative { my @tree = ($_[0]); my @results; for my $node (@tree) { (ref $node) ? push @tree, @$node : push @results, $node; } return \@results; } print join ' ', @{flatten_iterative($data_struct)}, "\n";

      You're assuming that all leaves must be at the same, known depth (5).

      If that's a valid assumption, natatime from List::MoreUtils can do the "matrification".

Re: A maybe(?) interesting comp-sci type problem
by runrig (Abbot) on May 09, 2008 at 07:51 UTC
    In the spirit of perhaps using a screwdriver to pound a nail (or maybe just putting arbitrary restrictions on yourself...sort of like composing a 12-tone piece), I wanted to see if I could use RFC: DBIx::Iterator to solve this. The function in the module returns an iterator that returns a hashref, so I wrapped the iterator returned from that to return an array (so this makes an iterator that returns a row of data at a time, so it is not quite what the OP asked for...but you can easily wrap the iterator to do that). Here is what I came up with:
    #!/usr/bin/perl use strict; use warnings; use DBIx::Iterator qw(mk_iterator list_iterator); my $data_struct = [ ...snip...(see OP) ]; my $iter = flatten($data_struct); while ( my @row = $iter->() ) { print "@row\n"; } sub flatten { my $data = shift; my $iter = mk_iterator(_flatten(0, $data)); sub { my $data = $iter->() or return; return @$data{sort { $a <=> $b } keys %$data}; } } sub _flatten { my ($i, $data) = @_; my @tmp = @$data; my @groups = (); while (@tmp) { my @nxt_group; push @nxt_group, shift @tmp while @tmp > 2 and !ref($tmp[1]); push @nxt_group, @tmp and @tmp=() if @tmp <= 2 and !grep ref($_), @tmp; if ( @nxt_group ) { push @groups, [ mk_list_iter($i, @nxt_group) ]; } if ( @tmp ) { my @nxt_grp; push @nxt_grp, mk_list_iter($i, shift(@tmp)); push @nxt_grp, _flatten($i+1, shift(@tmp)) while @tmp and ref($tmp[0]); push @groups, [ @nxt_grp ]; } } return @groups; } sub mk_list_iter { my ($i, @list) = @_; sub { list_iterator( LIST => [ map { [$_] } @list ], SELECT => [ $i ], @_ ); } }