spx2 has asked for the wisdom of the Perl Monks concerning the following question:

I've tried to implement a tree with variable number of children for each
node.The data for the tree is written as tuples of the form (child,parent)
and there is code for parsing that data and entering it into the tree
First a root for the tree is found,that is inserted first in the tree.
After that all the other nodes are inserted.
The code works,but I would like your oppinion on it and suggestions on how to optimise it.
#!/usr/bin/perl #===================================================================== +========== # # FILE: test.pl # # USAGE: ./test.pl # # DESCRIPTION: # # OPTIONS: --- # REQUIREMENTS: --- # BUGS: --- # NOTES: --- # COMPANY: # VERSION: 1.0 # CREATED: 01/27/2008 03:11:51 AM EET # REVISION: --- #===================================================================== +========== use strict; use warnings; use Data::Dumper; my $tree; my $s = "f b, b c, a c, e b, d a, g a"; my @tuples = map{ my @tuple=split ' ',$_; { child => shift @tuple, parent => shift @tuple, }; } split ',',$s; sub insert_node { my ($root,$parent,$child) = @_; foreach my $node (@$root) { foreach my $child_of_root (@{ $node->{children} }) { my $retval = insert_node([$child_of_root],$parent,$child); return $retval if $retval eq 1; }; if($node->{node} eq $parent){ # if we've found our parent then # put the node to be inserted as its child push @{ $node->{children} }, { node =>$child, children=>[], }; return 1; }; if($node->{node} eq $child) { # if the current node is the child of the node to be inser +ted # then replace the name of the current node with the paren +ts name # and put the old current node as its child my $old_node = { node =>$node->{node}, children =>$node->{children} }; $node->{node} = $parent; $node->{children} = [$old_node]; return 1; } }; #after the loop has ended if we arrive here #it means that there is no right place for the node to be #so we return it and append it to the tuples if we haven't #found a right place for it this moment we will in the future #but until the program ends return 0; } sub find_root { # this finds the root of the tree by progressively going from pare +nt to its parent and so forth my $count = scalar @tuples; my $root=$tuples[0]->{parent}; my $temp; $temp->{$_->{child}} = $_->{parent} foreach @tuples; while(exists $temp->{$root}) { $root=$temp->{$root}; } return $root; } $tree= [ { node => find_root(),children=>[]} ]; #build the tree with o +nly one node,namely the root foreach(@tuples) {#add progressively nodes to the tree my $retval = insert_node($tree,$_->{parent},$_->{child}) ; warn $retval; unless($retval) { #because insert_node did not find a proper place for that node #we pre-pend it to the @tuples array for later insertion push @tuples,$_; }; }; print Dumper($tree);

Replies are listed 'Best First'.
Re: tree with variable number of children
by GrandFather (Saint) on Jan 27, 2008 at 20:01 UTC

    I'd be inclined to use hashes instead of arrays. The code simplifies somewhat:

    use strict; use warnings; use Data::Dump::Streamer; my $s = "f b, b c, a c, e b, d a, g a"; my %children = map {split ' ', $_} split ',', $s; my %tree; # Build the tree push @{$tree{$children{$_}}}, $_ for keys %children; # Find the root my $root = (keys %children)[0]; while ($root) { last unless exists $children{$root}; $root = $children{$root}; } print "Root is $root\n"; Dump (\%tree);

    Prints:

    Root is c $HASH1 = { a => [ 'g', 'd' ], b => [ 'e', 'f' ], c => [ 'a', 'b' ] };

    Perl is environmentally friendly - it saves trees
Re: tree with variable number of children
by plobsing (Friar) on Jan 27, 2008 at 19:56 UTC
    Weaknesses (in my mind):
    • Walking up the tree and then down again
    • Using an array to store the relationships and then looping (posibly repeatedly) over that array to get the tree
    • Pathological case: linked list described tail first (you'll walk all the way back and then all the way forward again)
    • Pathological case: input discribing 2 disjoint trees (your format is flexible). Your code will likely loop FOREVER. It is an error to enter that, but you should still return.


    I think you could do it much more efficiently with a hash. Something to the effect of:
    my %nodes = (); foreach my $tuple (split ',', $s) { # shorter spliting my ($child, $parent) = split ' ', $tuple; # init nodes if not already done $nodes{$child} ||= { node => $child, children => []}; $nodes{$parent} ||= { node => $parent, children => []}; # insert relationship push @{$nodes{$parent}{children}}, \$nodes{$child} $nodes{$child}{parent} = $parent; }
    Then you could walk up the tree to get the root and already have a built tree.

    Update: your format is flexible enough to describe any directed graph. It is a reasonable expectation that code manipulating such data should halt for any acyclic input.