Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

All: In this node, Isanchez asked for some code that would allow entering an arbitrary start point and return a string of decendents. While there were many fine answers, they were "tree" based and not flat. I came up with following monstrosity:
package Isanchez; use strict; use Carp; our $VERSION = '0.01'; sub new { my ($class, $file) = @_; my $self = bless {}, $class; $self->_build($file); return $self; } sub _build { my ($self, $file) = @_; croak "Unable to retrieve data" unless open(INPUT,$file); while (<INPUT>) { my @field = split; unless (@field == 3 && $field[0] =~ /^\d+$/ && $field[0] =~ /^ +\d+/) { carp "Unable to parse\n$_"; next; } $self->{$field[2]}{decendent} = $field[0]; push @{$self->{decendents}[$field[1]]}, $field[2]; } } sub get_decendents { my ($self, $origin) = @_; croak "You must enter an origin in the tree" unless $origin; croak "$origin can not be found" unless exists $self->{$origin}; my $index = $self->{$origin}{decendent}; my @list = @{$self->{decendents}[$index]} if exists $self->{decend +ents}[$index]; return $origin unless @list; my @decendents = ($origin); while (@list) { push @decendents, @list; my @temp; for my $decendent (@list) { my $index = $self->{$decendent}{decendent}; push @temp, @{$self->{decendents}[$index]} if exists $self +->{decendents}[$index]; } @list = @temp; } return join ', ' , @decendents; } #### Example Usage: #### #!/usr/bin/perl -w use strict; use Isanchez; my $family_tree = Isanchez->new('file.dat'); print $family_tree->get_decendents('dog'), "\n"; # Where file.dat looks like __END__ 1 0 domestic_animal 2 1 dog 3 2 terrier 4 2 collie 5 3 fox_terrier
While this works as requested, the get_decendents method looks very amaturish to me. Before I go on to create the other methods I dreamed up:
  • get_acendants
  • get_decendants_tree # Same data in tree format
  • get_acendants_tree # Same data in tree format

    I was hoping one of you fine monks could show me a more elgeant/efficient/perlish solution.

    Cheers - L~R

  • Replies are listed 'Best First'.
    Re: Code Review (recursive data structure)
    by diotalevi (Canon) on Jul 29, 2003 at 22:18 UTC

      You'll have to update your node fetching routine but I fixed up the bugs in your _build routine. As for fetch routine itself - it is highly unperly in that instead of just returning @descendants you create some string that would have to be processed by the script user to get the children. I happen to think I had a very nice solution in Re: recursive complex structure or something like that? since it doesn't require the moduler writer to manually dereference the pointers. What you did was swap my weak references for the equivalent of symbolic references and in general I think having real references is a better idea.

      Also note that your fetch routine can be forced into an infinite loop if there is a cycle in the parent/child relationships. On an overall quality level, you need to use better variable names than @fields. Your identifiers should be descriptive. Heck, its part of your documentation.

      sub _build { my ($self, $file) = @_; croak "Unable to retrieve data" unless open(INPUT,$file); # You just stomped on the global $_. You have to either assign it # to a pad variable (like I did here) or local $_ first. Just a # plain while(<FH>) is dangerous. while (my $line = <INPUT>) { # You didn't pick named variables. Variables should have meani +ngful # names so you don't have to guess at what $field[2] is later +in # the code. my ($id, $parent_id, $title) = split ' ', $line; unless ( $id =~ /^\d+$/ and $parent_id =~ /^\d+$/ ) { carp "Unable to parse\n$line"; next; } # You were assigning the id to descendants. That should be sto +red # separately. Also, you used the hash keys of the object for t +he # node titles but also expect the 'decendents' key to remain u +nused. # You should keep that separate somewhere. $self->{'title2id_dict'}{$title} = $id; # Shall we assume the IDs all start low and go up? You'll cons +ume # much memory if an ID comes in like 1000000. I compensate for + that # and just use a hash. push @{$self->{'descendants'}{$parent_id}}, $id; } }
        diotalevi, Thanks - let me work through your points:
      • Returning string instead of @descendants is un-perlish
             Agreed. Because the desired output was a string I encorporated that into the module. It would have been better i guess to show the usage as print (join ',' $family_tree->get_decendents('dog')) , "\n"; and just returned @descendants
      • Your solution was nice
             But it didn't meet Isanchez's requirements about being a single line of output starting at an arbitrary point.
      • Infinite loop
             yes, if a node has a pointer to a child that is actually a parent - it will cause an infinite loop. That was one of the things I was working on when I decided it was just "yucky" code.
      • equivalent of symbolic references
             I don't understand this statement
      • Better variable names
             Could not agree more
      • My use of key assignment
             I disagree with your synopsis of what I did (but could very will just be misunderstanding you). I did not assign the id to the decendents, it is the array index where the decendents can be found. As far as the possibility that I may step on myself if I get a title that is 'decendents' (I think that is what you meant) - I could move it to a private base key.
      • hash "index" versus array for memory purposes
             I agree
      • I will have to fix get_decendents myself
             This is the piece of code I was hoping to get help on :-)

        How does this look like for an update?

        Cheers - L~R