Amphiaraus has asked for the wisdom of the Perl Monks concerning the following question:

Hi All -- If you have time, I would very much appreciate it if you could answer the following questions about Tree::Nary.

1) I would like to create Tree::Nary nodes whose "data" field is a hash containing information on ClearCase versions of directories and files.

At present I have 2 strings in this hash:
my %node_data;
$node_data{dir_name} = "/vobs/engine_cdma2000/code/dmss/apps/MediaPlayer";
$node_data{dir_vep} = "/vobs/engine_cdma2000/code/dmss/apps/MediaPlayer@@/main/par_x_haloc9_s74.47.9/bld_01.06.00_ca25_krmt43_haloc/1";
$root = Tree::Nary->new(%root_node_data);
B) When I try to print the contents of the hash stored in the "data" field of my Tree::Nary node with this function, I run into problems:
sub print_nary_dir_tree {
    my $tree = shift;
    my $printsub = sub {
        my $node = shift;
        my %node_data = $node->{data};
        if (defined $node) {
            print "node_data dir_name holds $node_data{dir_name}\n";
            print "node_data dir_vep holds $node_data{dir_vep}\n";
        }
        return 0;
    };

    $tree->traverse($tree,
        $Tree::Nary::IN_ORDER,
        $Tree::Nary::TRAVERSE_ALL,
        -1,
        $printsub
    );
}
~/builds/HALOC9_X_01.06.00R_NARY > nary_tree_testing.pl
Odd number of elements in hash assignment at nary_tree_testing.pl line 196.
Use of uninitialized value in concatenation (.) or string at nary_tree_testing.pl line 198.
node_data dir_name holds
Use of uninitialized value in concatenation (.) or string at nary_tree_testing.pl line 199.
node_data dir_vep holds

Can you let me know the proper way to print out contents of a Tree::Nary data field, if it is a hash?

2) I need to load a Tree::Nary tree with nodes that represent a number of ClearCase versions of directories. The first few lines of my input file would look like this:
/vobs/engine_cdma2000/code/dmss
/vobs/engine_cdma2000/code/dmss/apps
/vobs/engine_cdma2000/code/dmss/apps/MediaPlayer

The N-ary tree would look like this:

vobs
  engine_cdma2000
    code
      dmss
        apps
          MediaPlayer

Can you suggest a good algorithm?

I was thinking of tokenizing each directory version and putting each directory it contains in an array.

Each node would hold the "dir_name" it represents, for example "/vobs/engine_cdma2000/code/dmss".

My algorithm for processing "/vobs/engine_cdma2000/code/dmss":

A) Is "/vobs" in the tree? If not, initialize the tree with a "/vobs" node as root.

B) Is "/vobs/engine_cdma2000" in the tree? If not, make "engine_cdma2000" the child of the latest found directory node, "/vobs".

C) Is "/vobs/engine_cdma2000/code" in the tree? If not, make "code" the child of the latest found directory node, "/vobs/engine_cdma2000".

D) Is "/vobs/engine_cdma2000/code/dmss" in the tree? If not, make "dmss" the child of the latest found directory node, "/vobs/engine_cdma2000/code".
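Steps A) through D) can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: it uses nested plain Perl hashes as a stand-in for Tree::Nary nodes, and the helper names add_path and print_tree, the "children" key, and the synthetic "/" root are hypothetical, chosen for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: insert one directory path into a nested-hash tree.
# Each node is { dir_name => full path so far, children => { name => node } }.
sub add_path {
    my ($root, $path) = @_;
    my $node   = $root;
    my $prefix = '';
    for my $part (grep { length } split m{/}, $path) {
        $prefix .= "/$part";
        # "Is this prefix in the tree? If not, make it a child of the
        # latest found directory node."
        $node->{children}{$part} ||= { dir_name => $prefix, children => {} };
        $node = $node->{children}{$part};
    }
    return $node;
}

# Hypothetical helper: print one directory name per line, indented by depth.
sub print_tree {
    my ($node, $depth) = @_;
    for my $name (sort keys %{ $node->{children} }) {
        print '  ' x $depth, $name, "\n";
        print_tree($node->{children}{$name}, $depth + 1);
    }
    return;
}

my @paths = (
    '/vobs/engine_cdma2000/code/dmss',
    '/vobs/engine_cdma2000/code/dmss/apps',
    '/vobs/engine_cdma2000/code/dmss/apps/MediaPlayer',
);

# Synthetic root so that several top-level directories could coexist.
my $root = { dir_name => '/', children => {} };
add_path($root, $_) for @paths;
print_tree($root, 0);
```

Because each path is walked one component at a time, the order of the input lines does not matter in this sketch: a missing intermediate directory is created on the way down.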

3) Once the N-ary tree of directory versions is loaded, I must use it to divide the directories into groups, each to be handled by a separate parallel process.

The tree must be split into smaller subtrees, with one or more subtrees assigned to each group, such that if a given directory is in the input file, all of its subdirectories are in the same group.

I need to copy these directories from a developer branch to a core branch, and I cannot allow a "super" directory X and its subdirectory Y to be in separate groups, because that risks a failed attempt to copy a new subdirectory Y to the core branch BEFORE "super" directory X -- directory X is what gives visibility to a newly created subdirectory Y.

This is an even harder problem than the original Tree::Nary tree creation, but I would appreciate any hints you might give.

Thanks,
Amphiaraus

Replies are listed 'Best First'.
Re: Tree::Nary n00b has questions
by jethro (Monsignor) on Aug 24, 2008 at 22:49 UTC

    First of all, please edit your node and put <code> tags around your code segments.

    Here is what your sub might look like (untested):

    my $printsub = sub {
        my $node = shift;
        if (defined $node) {
            my $node_data = $node->{data};
            if (defined $node_data and ref $node_data eq 'HASH') {
                print "node_data dir_name holds $node_data->{dir_name}\n";
                print "node_data dir_vep holds $node_data->{dir_vep}\n";
            }
        }
        return 0;
    };

    I moved the assignment of $node_data after the test of whether $node is defined. There is also no need to copy the hash, so I changed it to access the node data directly through the reference. If you prefer the copy, use my %node_data = %{$node->{data}}; for the assignment instead. The test for definedness of $node should be unnecessary, since the module won't call the routine without a node.
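To make the difference between the two kinds of access concrete, here is a small sketch that needs no tree module at all. It assumes the data slot holds a hash reference (that is, the node was given \%node_data rather than the flattened hash); a plain hash called $node stands in for the Tree::Nary node. The "Odd number of elements in hash assignment" warning in the question suggests that the slot held a plain scalar instead, which is what Perl reports when a single non-reference value is assigned to a hash.

```perl
use strict;
use warnings;

my %node_data = (
    dir_name => '/vobs/engine_cdma2000/code/dmss/apps/MediaPlayer',
    dir_vep  => '/vobs/engine_cdma2000/code/dmss/apps/MediaPlayer@@/main/par_x_haloc9_s74.47.9/bld_01.06.00_ca25_krmt43_haloc/1',
);

# Stand-in for a tree node (assumption): the data slot holds a REFERENCE
# to the hash, not the hash itself.
my $node = { data => \%node_data };

# Direct access through the reference; nothing is copied.
my $data = $node->{data};
if (ref $data eq 'HASH') {
    print "node_data dir_name holds $data->{dir_name}\n";
}

# Copy: dereference first, then assign the resulting list to a hash.
my %copy = %{ $node->{data} };
print "node_data dir_vep holds $copy{dir_vep}\n";
```

Changes made through $data are visible in %node_data, while changes to %copy are not, which is usually the deciding factor between the two forms.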

    UPDATE: added defined-test of $node_data

Re: Tree::Nary n00b has questions
by tilly (Archbishop) on Aug 25, 2008 at 03:34 UTC
    I've never heard of Tree::Nary before. (Searches.) Hrm, based on http://cpanratings.perl.org/dist/Tree-Nary and what I understand about how Perl works internally, I wouldn't suggest using it. Instead I would suggest thinking in terms of native Perl data structures, using references to build more complex ones. That will be less code, should perform better, and will save memory. Read References quick reference if you need to learn how to build more complex references.

    Still, if you wish to persist in this approach, the answer to your first question is that you should learn how to handle references. While you're learning, Data::Dumper may be a good friend. I would handle your second question by turning each entry into a small anonymous hash and collecting an array of those, then post-processing that array: for instance, sort it by directory and then group it. (Note that I would not organize it using Tree::Nary, but you could if you wanted to.) My first answer to your third question is that you should not plan on parallelizing your code until you have a proven performance problem. If you do, then as I said, I would organize the data by sorting it first and then running some grouping logic on it.
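One way the sort-then-group idea could look is sketched below. It is an illustration under assumptions, not a tested solution for the real data: the two "tools" paths are made up for the example, and the rule used is that a directory starts a new group unless it lies underneath the directory that started the current group. The sort compares paths with "/" replaced by a NUL byte so that a directory's subdirectories always follow it immediately, even when a sibling name such as "dmss-old" would otherwise sort in between.

```perl
use strict;
use warnings;

my @dirs = (
    '/vobs/engine_cdma2000/code/dmss/apps/MediaPlayer',
    '/vobs/engine_cdma2000/code/dmss',
    '/vobs/engine_cdma2000/tools',          # hypothetical entry
    '/vobs/engine_cdma2000/code/dmss/apps',
    '/vobs/engine_cdma2000/tools/build',    # hypothetical entry
);

# Sort on a key in which the path separator is the lowest possible byte,
# so every directory is followed directly by everything below it.
my @sorted = map  { $_->[1] }
             sort { $a->[0] cmp $b->[0] }
             map  { [ join("\0", split m{/}, $_), $_ ] } @dirs;

# Walk the sorted list: a directory under the current top joins that
# group, anything else becomes the top of a new group.
my (@groups, $top);
for my $dir (@sorted) {
    if (defined $top and index($dir, "$top/") == 0) {
        push @{ $groups[-1] }, $dir;
    }
    else {
        $top = $dir;
        push @groups, [$dir];
    }
}

for my $i (0 .. $#groups) {
    print "Group $i:\n";
    print "  $_\n" for @{ $groups[$i] };
}
```

Within each group the parent directory comes before its subdirectories, so processing a group front to back respects the "super directory first" constraint from the question; how the groups are then handed to processes is left open here.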

Re: Tree::Nary n00b has questions
by dragonchild (Archbishop) on Aug 25, 2008 at 12:12 UTC
    Use Tree instead of Tree::Nary. It better supports the traversal and value needs you're describing. Plus, it allows you to introspect the tree more easily in order to divide it.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      Thanks all. I am very impressed by the quick responses; there's a lot of information here I can use.

      On a lighter note I wonder if there are Perl T-shirts. At science fiction conventions I often see Sci-Fi T-shirts with amusing messages, like "Livin' La Vida Dorka", or "The Three Byzantine Stooges" in which the main Byzantine religious heresies are translated into Three Stooges terms. :)
        I wonder if there are Perl T-shirts
        Buy Stuff