walking a tree

punkish has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

My brain is tired from staring at this. I need your sharp eyes to tell me what I am doing wrong. I am working on the basis of a tutorial on relational modeling of trees and graphs (see http://www.oreillynet.com/lpt/a/2958), aka "preorder traversal." I have the following data

table:
child_node_id  parent_node_id  left_id  right_id
-------------  --------------  -------  --------
2              1             
3              1             
4              1             
5              1             
6              2             
7              2             
8              4             
9              5             
10             7             
9              10   

my $children = $dbh->prepare(qq{
  SELECT child_node_id FROM table WHERE parent_node_id = ?
});

my $setleft = $dbh->prepare(qq{
  UPDATE table SET left_id = ? WHERE child_node_id = ?
});

my $setright = $dbh->prepare(qq{
  UPDATE table SET right_id = ? WHERE child_node_id = ?
});

my $ctr = 1;
my $rootid = 1;

walktree($rootid);

sub walktree {
  my ($id) = @_;
  
  $setleft->execute($ctr++, $id);
  $children->execute($id);
  
  while(my ($subid) = $children->fetchrow_array) {
    walktree($subid);
  }
  
  $setright->execute($ctr++, $id);
}
[download]

However, the code shown above does not work. It starts off with

parent_node_id = 1 and finds its four child_nodes (2, 3, 4, and 5)
- starts working on parent_node_id = 2 (the first of the four nodes found above) and finds its two child_nodes (6 and 7)
  - starts working on parent_node_id = 6 (the first of the two nodes found above)
  - since 6 is a terminal node (no child nodes for 6), it is supposed to go on to 7, but that doesn't happen

What am I doing wrong?

Update: Solved. See replies from gamache and blokhead below.

--

when small people start casting long shadows, it is time to go to bed

Comment on walking a tree Download Code

Replies are listed 'Best First'.
Re: walking a tree by gamache (Friar) on Nov 13, 2007 at 03:46 UTC
Each successive pass of `walktree` clobbers the last results of `fetchrow_array`, since it executes on the same statement handle. One solution is to fetch all the rows at once, like this: `my ($subid, @children); while (($subid) = $children->fetchrow_array) { push @children, $subid; } for $subid (@children) { walktree ($subid); }` [download] instead of: `while(my ($subid) = $children->fetchrow_array) { walktree($subid); }` [download]	[reply] [d/l] [select]
Re: walking a tree by blokhead (Monsignor) on Nov 13, 2007 at 03:48 UTC
Each recursive call to `walktree` uses the same `$children` statement handle. So most likely, each time a recursive call re-executes the `$children` statement, it clobbers whatever was pending for the caller in that statement handle. When the control returns to the caller, there is nothing left to iterate over. If you think about how things happen sequentially, `walktree` only returns once `$children->fetchrow` fails. Then right after it returns, the first thing the caller tries to do is `$children->fetchrow`! To fix this, either make new statement handles within each call to `walktree`, or else fetch all of the results from the `$children` query into an array before recursing at all. blokhead	[reply]
Re: walking a tree by dragonchild (Archbishop) on Nov 13, 2007 at 14:54 UTC
Umm ... creating doubly-linked lists in databases should be avoided at (almost) all costs. It violates the DRY principle of normalization. That's "Bad"(tm). This looks like a really bad direct translation of a procedural data structure into a relational model. Don't do that. Trees are generally represented in one of two fashions in a database, each with its benefits/drawbacks. The first is the way you're doing it - a table that has data, an id, and a parent_id. To find all children, you ask `SELECT id FROM table WHERE parent_id = ?`. If you need ordering (such as a binary tree), add in an order column. This is a really good fashion for mutations, but it sucks for queries, unless you're on Oracle or Postgres and have the CONNECT BY extension (and even then it still sucks). The other method (nested sets) is a bit harder to explain. Take a look at http://dev.mysql.com/tech-resources/articles/hierarchical-data.html for a good article on the topic. Less efficient on mutations, but much more efficient in terms of queries, including traversals. My criteria for good software: Does it work? Can someone else come in, make a change, and be reasonably certain no bugs were introduced?	[reply] [d/l]