Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

Hi nasa,

What you are doing is a fairly classical data structure manipulation. You have a tree in list form. Or a list of node id's and their parent id's. The terms you have used are little different, usually we say that we have a parent-child relationship, and we dont talk about numbers, but rather id's or identifiers. I say all this because in future questions you may find the terms useful and easier to explain what you want to do. Anyway, here is yet another soluton of your problem.

use strict; use warnings; sub ancestors { my $node_parents = shift; my $node = shift; my @ancestors; while (defined($node_parents->{$node})) { push @ancestors,$node_parents->{$node}; $node=$node_parents->{$node}; } return wantarray ? @ancestors : \@ancestors } my %node_parent; #my $data_file=""; #open my $dat_fh, $data_file or die "Could not open $data_file: $!"); #while (<$dat_fh>) { while (<DATA>) { if (/^\s*(\d+)\s+(\d+)/) { $node_parent{$1}=$2; } } my $check_node='6'; print "Ancestory of $check_node from newest to oldest:\n"; print join(", ",ancestors(\%node_parent,$check_node)); # the tree looks like: # 0 # +-+-+ # 1 2 # +-+ +++ # 3 4 5 # +-+-+ # 6 7 8 # # so we should get the output: # Ancestory of 6 from newest to oldest: # 3, 1, 0 __DATA__ 1 0 2 0 3 1 4 2 5 2 6 3 7 3 8 3

Potentially you dont need to read the whole list first. This would be true if and only if every parent had a lower id of all of its children and and every "son" id in the list was greater in id than all of the preceding son id's.

Depending on what types of operations you will frequently need to perform you may wish to generate a more complicated data structure than a simple parent list. (which is what your data represents and what the hash does). You may wish to do some thing like this:

use strict; use warnings; # subroutines for walking a node tree # each takes a hash of nodes as the first parameter # and the id of the node to start with # # each $nodes->{$id} is of the form # { # id => $id, # integer # parent => undef, # undef for no parent, otherwise an integer # children => [3,5,6], # undef or a ref to an array of optional chil +dren. # } sub dump_nodes { my $nodes=shift; foreach my $node ( sort { $a->{id} <=> $b->{id} } values %$nodes ) { printf "Node %10s Parent: %- 10s Children: %s\n", $node->{id}, defined($node->{parent}) ? $node->{parent} : "undef", join(", ",sort {$a <=>$b } @{$node->{children}||[]}); } } sub dump_nodes_as_tree { my $nodes = shift; my $node_id = shift; my $depth = shift||0; unless (defined $node_id) { my @root_nodes=grep { !defined $_->{parent} } values %$nodes; foreach my $root ( sort { $a->{id} <=> $b->{id} } @root_nodes +) { dump_nodes_as_tree($nodes,$root->{id},0); } } else { my $node=$nodes->{$node_id}; printf "%s%d(%s)[%s]\n", (" " x $depth), $node_id, defined($node->{parent}) ? $node->{parent} : "undef", join(",",sort {$a <=>$b } @{$node->{children}||[]}); foreach my $child_id (sort {$a<=>$b} @{$node->{children}||[]}) + { dump_nodes_as_tree($nodes,$child_id,$depth+1); } } } sub ancestors { my $nodes = shift; my $node_id = shift; my @ancestors; while (defined($nodes->{$node_id}{parent})) { push @ancestors,$nodes->{$node_id}{parent}; $node_id=$nodes->{$node_id}{parent}; } return wantarray ? @ancestors : \@ancestors } sub children_eldestline { # perorder traversal of the tree my $nodes = shift; my $node_id = shift; my $children = shift || []; foreach my $child_id (sort {$a<=>$b} @{$nodes->{$node_id}{children +}||[]}) { push @$children,$child_id; children_eldestline($nodes,$child_id,$children); } return wantarray ? @$children : $children } sub children_bygen { # breadthfirst traversal of the tree my $nodes = shift; my $parent_id = shift; my @children; my @queue=($parent_id); while (@queue) { my $node_id=shift @queue; foreach my $child_id (sort {$a<=>$b} @{$nodes->{$node_id}{chil +dren}||[]}) { push @children,$child_id; push @queue,$child_id; } } return wantarray ? @children : \@children; } my %nodes; #my $data_file=""; #open my $dat_fh, $data_file or die "Could not open $data_file: $!"); #while (<$dat_fh>) { while (<DATA>) { if (/^\s*(\d+)\s+(\d+)$/) { #print "$2 is parent of $1\n"; $nodes{$1}{id}=$1; $nodes{$1}{parent}=$2; $nodes{$2}{id}=$2; push @{$nodes{$2}{children}},$1; } } print "Nodes:\n"; print dump_nodes(\%nodes),"\n"; print "Nodes as tree:\n"; print dump_nodes_as_tree(\%nodes),"\n"; print "Ancestory of 6 from newest to oldest (parent list):\n"; print join(", ",ancestors(\%nodes,6)),"\n"; print "Children of 0 by the liniage of the oldest child (pre-order):\n +"; print join(", ",children_eldestline(\%nodes,0)),"\n"; print "Children of 0 in by generation (breadth-first):\n"; print join(", ",children_bygen(\%nodes,0)),"\n"; # the tree looks like: # 0 # +-+-+ # 1 2 # +-+ +++ # 3 4 5 # +-+-+ # 6 7 8 # # so we should get the following output: # # Nodes: # Node 0 Parent: undef Children: 1, 2 # Node 1 Parent: 0 Children: 3 # Node 2 Parent: 0 Children: 4, 5 # Node 3 Parent: 1 Children: 6, 7, 8 # Node 4 Parent: 2 Children: # Node 5 Parent: 2 Children: # Node 6 Parent: 3 Children: # Node 7 Parent: 3 Children: # Node 8 Parent: 3 Children: # # Nodes as tree: # 0(undef)[1,2] # 1(0)[3] # 3(1)[6,7,8] # 6(3)[] # 7(3)[] # 8(3)[] # 2(0)[4,5] # 4(2)[] # 5(2)[] # # Ancestory of 6 from newest to oldest (parent list): # 3, 1, 0 # Children of 0 by the liniage of the oldest child (pre-order): # 1, 3, 6, 7, 8, 2, 4, 5 # Children of 0 in by generation (breadth-first): # 1, 2, 3, 4, 5, 6, 7, 8 __DATA__ 1 0 2 0 3 1 4 2 5 2 6 3 7 3 8 3
which builds a tree of nodes with previous pointer and children pointers and provides a couple of utility subs for finding the children (in different forms), printing out the data structure and the like. A more robust, but probably too confusing approach for right now would be to use real references instead of id lookups (actually on second thought using pure references might be less confusing, but for now whatever :-). This would have the advantage that if you had operations like delete in mind that you could delete entire subtrees with a single assignment.

Anyway, hopefully you find this node and its contents a useful introduction into different ways of representing heirarchical relationships, and operations that can be performed on them. I hope however that you will take the time to search on "Tree representation" and "Trees" for tree specific material (Binary Tree's are the canonical data structure in many respects) and obtain a book like "Algorithms and Data Structures in Perl" (O'Reiley). This will walk you through the different forms of representations in much more depth and precision.

You will need to look under "Graphs" and "Graph theory" if you want to research up on general material related to this. To computer science people and mathematicians all of this stuff comes under the heading "Graph Theory". A tree with only parent pointers or only child pointer is technically called a "DAG", directed acyclic graph, and when it has both parent and child pointers it is a "Directed Graph". There are also other names originating in specific subfields, such as databases (where they talk about keys and constrained indexs), networking (a graph is a network amd vice versa), or various forms of system programming where nodes, pointers and children are the terms used, or graphics where they use lots of different terms (its not my field :-))

Cheers and good luck. And definately dont put off asking a question about any of this if you feel the slightest confusion. If something is over your head we will do our best not to let you drown. :-)


In reply to Re: Correction to Database problem by demerphq
in thread Correction to Database problem by nasa

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (3)
As of 2023-03-23 08:18 GMT
Find Nodes?
    Voting Booth?
    Which type of climate do you prefer to live in?

    Results (60 votes). Check out past polls.