jjohhn has asked for the wisdom of the Perl Monks concerning the following question:

I have a toy hierarchical structure made out of Is_a relations. Each row is read left-to-right: a 'chevy' is_a 'car'. I did this to practice/learn recursion and references and the such. My script reads from a 2 column table (12 lines of __DATA__ appended to the script) and displays the elements.

I have two display subroutines; one gives all the unique descendants of a concept, and the other prints an indented tree-structure of the whole 'terminology'. Each of them recursively pops the elements off of a hash of "immediate children" that was constructed with a split of the original table data.

My problem: Either subroutine on its own works fine, but both together do not. Only one display (either the tree or the unique list) works, and the other shows only one element. I tried building the hash anew after each of these subroutine calls, (as the program shows) but that did not help.

I enclose the code with one of the routines commented out. Entering "taxonomy" at the prompt will display the whole tree. Thanks for any help.

John

#gives all descendents for a tree, given a root node use strict; use warnings; my ($curr_string, $in, $select, %immed_child, %all_descends); process_input(); load_hashes(); #make_display_string("$in"); #print "\n$curr_string\n"; load_hashes(); show_results(); sub process_input{ while(1){ print qq(enter concept name or "Q" to quit\n); chomp($in = <>); exit() if $in =~ /^\s*q\s*$/i; $in =~s/^\s*(\w+)\s*$/$1/; # still need to check if a valid concept print qq(enter:\n"A" for all descendents,\n"I" for immediate c +hildren\n"P" for parents\n); chomp($select = <>); $select =~ s/^\s*([aip])\s*$/$1/i; last; } } # key: parent # val: list of immediate children sub load_hashes{ my ($child, $par); while(<DATA>){ chomp; ($child, $par) = split/\t/; push @{$immed_child{$par}}, $child; } } #global hash needed to exchange vals w other subs sub get_descends{ my $root = shift; $all_descends{$root}++ if $root; while(1){ last unless my $immed = pop @{$immed_child{$root}}; get_descends($immed); $all_descends{$immed}++; } return keys %all_descends; } sub make_display_string{ my $root = shift; my $level = shift ; $curr_string = shift || $root; # $curr_string = "\n". $curr_string; $level++; while (1){ last unless my $immed = pop @{$immed_child{$root}}; $curr_string .="\n". "\t" x $level . $immed; make_display_string($immed, $level, $curr_string); } #$curr_string = "\n" . $curr_string; } sub show_results{ $" = "\n\t"; if ($select =~ m/a/i){ print qq(\ndescendents of \n"$in":\n); my @descends = get_descends($in); print "\t@descends\n"; } elsif ($select =~m/i/i){ if (@{$immed_child{$in}}){ print qq(\nimmediate children of \n"$in":\n); print "\t@{$immed_child{$in}}\n"; } else{ print "no immediate children of $in"; } } elsif ($select =~ m/p/i){ print "working on this\n"; } } __DATA__ vehicle taxonomy exercise device taxonomy computer topic taxonomy automobile vehicle ford automobile chevy automobile bike vehicles bike exercise device jumprope exercise device treadmill exercise device perl computer topic database computer topic

Replies are listed 'Best First'.
Re: recursive difficulty
by Enlil (Parson) on Mar 09, 2003 at 23:01 UTC
    A couple of things. First and foremost you once you have read everything out of <DATA> you would not be able to call the routine again to reload the data. (a simple example should show what i mean):
    use strict; use warnings; get_data(); get_data(); sub get_data { while ( <DATA> ) { print $_; } } ##get_data __DATA__ A B C D
    Second, instead of popping things off of the array (in the hash) and thus changing the data in the structure, I would use a construct like the following,so that the data structure stays intact and thus there would be no reason to rebuild it. What I mean is that, instead of this:
    while(1){ last unless my $immed = pop @{$immed_child{$root}}; get_descends($immed); $all_descends{$immed}++; }
    I would do something like this instead:
    foreach my $immed ( @{$immed_child{$root}} ) { get_descends($immed); $all_descends{$immed}++; }
    Anyhow, I hope this helps.

    -enlil

      Thank you. I did as you suggested, and it seems like my split has now somehow broke: I get the error "Use of unitialized value in concatenation (.) or string at tree.pl line 36, <DATA> line 1 (for all the <DATA> lines).

      I copied and pasted the code from perlmonks onto a different computer; that may be related.

      I thought perhaps my vim expandtabs option on this machine was involved, but I retyped the DATA using ^V. This code

      # key: parent # val: list of immediate children sub load_hashes{ my ($child, $par); while(<DATA>){ chomp; ($child, $par) = split /"\t"/; #push @{$immed_child{$par}}, $child; print "ch: $child, par: $par\n"; } }
      prints the entire table row as the value of $child, and no value for $par. For example:

      ch: jumprope exercise_device, par:

      this explains the unitialized value in the string. Is there an obvious mistake in my split that has somehow occurred in the copying of the code from perlmonks onto a new machine?

        That's one of the problem I had at the beginning. By copy and paste, all those \t's are gone. I just manually added those \t's back to your __DATA__. Those \t's are what you used in your split.
Re: recursive difficulty
by pg (Canon) on Mar 09, 2003 at 23:14 UTC
    Played with your code, the way you using pop is a big problem, not recursive.

    In those display functions, I don't think you want to change the content of the hash being displayed, but pop will remove those elements from your hash (to be more precise, elements are removed from those array refs, which are elements of your hash).

    When you try to display again, those elements are gone ;-).

    Made some little changes here and there, the main points are:
    1. Removed pops, instead just use foreach
    2. modified the sequence you call sub routines
    use strict; use warnings; my ($curr_string, $in, $select, %immed_child, $all_descends); load_hashes(); process_input(); sub process_input{ while(1){ print qq(enter concept name or "Q" to quit\n); chomp($in = <>); exit() if $in =~ /^\s*q\s*$/i; $in =~s/^\s*(\w+)\s*$/$1/; # still need to check if a valid concept print qq(enter:\n"A" for all descendents,\n"I" for immediate c +hildren\n"P" for parents\n); chomp($select = <>); $select =~ s/^\s*([aip])\s*$/$1/i; print $select, "\n"; show_results(); } } # key: parent # val: list of immediate children sub load_hashes{ my ($child, $par); while(<DATA>){ chomp; ($child, $par) = split/\t/; push @{$immed_child{$par}}, $child; } } #global hash needed to exchange vals w other subs sub get_descends{ my $root = shift; $all_descends->{$root}++ if $root; foreach my $immed (@{$immed_child{$root}}) { get_descends($immed); $all_descends->{$immed}++; } return keys %{$all_descends}; } sub make_display_string{ my $root = shift; my $level = shift ; $curr_string = shift || $root; # $curr_string = "\n". $curr_string; $level++; foreach my $immed (@{$immed_child{$root}}) { $curr_string .="\n". "\t" x $level . $immed; make_display_string($immed, $level, $curr_string); } #$curr_string = "\n" . $curr_string; } sub show_results{ $" = "\n\t"; if ($select =~ m/a/i){ print qq(\ndescendents of \n"$in":\n); $all_descends = {}; my @descends = get_descends($in); print "\t@descends\n"; } elsif ($select =~m/i/i){ if (@{$immed_child{$in}}){ print qq(\nimmediate children of \n"$in":\n); print "\t@{$immed_child{$in}}\n"; } else{ print "no immediate children of $in"; } } elsif ($select =~ m/p/i){ print "working on this\n"; } } __DATA__ vehicle taxonomy exercise device taxonomy computer topic taxonomy automobile vehicle ford automobile chevy automobile bike vehicles bike exercise device jumprope exercise device treadmill exercise device perl computer topic database computer topic


    Update1:

    Actually there are other problems, for example, when I try to get all descendents of bike, it displayed a list for me, when bike even does not have immediate descedent.

    Your get_descendent function is changing the content of global %all_descents each time it is called. that should not happen. This is a good lesson, why one should put variables into smaller scopes.

    I didn't fix this, you can try to fix it.

    Update2:

    I really should leave this to you, but just cannot avoid touching your code.

    Now I changed your all_descents from hash to hash ref. And added one line to reset $all_descents to {} each time before you get into the recursive. This fixes the problem I described in update1.

    Still there is one thing I consider as a bug (of course there could be more bugs, I am not throughly testing this ;-). It is that, when I try to display all descendents of bike, it would take bike as a descendents of itself, which I don't think right.

    Give you a hint, the problem is the second line of get_descent(), now I really leave this to you.

    One thing I did bad is that I fixed some problems without modify the style of your code. This should not lead you to think that you can continue with your style. I am absolutely serious ;-), that you better use variables with smaller scopes.

    Update3:

    jjohhn, I have read your reply.

    1. There is no problem with your hash, the only reason I changed it to hash ref is that, I want to do $hash = {} to reset. Well, actually I could do a %hash = ().
    2. I didn't say your subroutines are not subroutines ;-) What I said is that I changed the when/where to call your subroutines, for example, now it only reads the file once...
      Thank you. Clearly popping off the hash and trying to reread <DATA> are my biggest problems...

      I agree that I don't like the global hash, which is why I commented on it at the top of get_descends. I spent a lot of time trying to make that variable local to the subroutine, but the recursive re-running of the subroutine walked over the values in the hash. I want to do this better.

      how does changing the global hash to a global hash reference help? pardon me if that is a stupid question.

      I actually wanted the descendants to include the root, because originally that function could be used to display the entire terminology. Now that the display string displays everything, I don't need to display the root in show_results()

      why do you refer to "the sequences I call subroutines" Please explain so I can understand why they are not subroutines.

      Thank you for your help.

        Thanks all. I got my text editor to cooperate with the tabs and changed to foreaches. I'd like to print the tree in a slightly different order, so the children stack more nicely under each other.
        Right now it shows
        vehicle
        ---automobile
        ---------ford
        ---------chevy
        ---bike

        But I'm pretty satisfied to clear up the foreach business and the __DATA__ business. Once again I am grateful.
        John