Graph model problem.

newbio has asked for the wisdom of the Perl Monks concerning the following question:

Since there was no response from dear Monks to my earlier post..:(, I guess I need to present the problem in a more
visual manner by taking an example. Thus, I have redrafted my code and the problem. Hope, this time it works and dear Monks get pleased and solve my problem...:).

OK, the program does the following:

Given the graphical network topology with all nodes and their associated values, the program creates conditional
probability tables (cpt) associated with each node in the network. As an example, I have tried to implement the
graphical network shown here: http://www.snn.ru.nl/~wimw/asci2003/bayesintro.html . Please note that CPTs in the
example are the tables shown next to each node (see the URL). So, for each combination of values of a particular node
and its parents a probability value is assigned in the node cpt. Since the probability is conditional, all rows in a cpt add up to
1. My program does not learn the probability values "yet", and simply assigns 1 for each of those probability slots.

However, there appears some error in the code somewhere which I am unable to detect. When I pass single node name in
my code through @nodes (such as @nodes=('sprinkler'); or @nodes=('wetgrass') etc);, the result (i.e. cpt) comes
out correct. However, when I pass multiple nodes through @nodes (such as
@nodes=('sprinkler','cloudy','wetgrass','rain');) then the first cpt (for node sprinkler) in the result comes out correct, however, the
remaining ones are wrong (output shown below).
Please guide me as to where the fault lies in the code. Also, are you happy with my data structures for cpts?

The idea is to use the graph structure for training and inference later.

##############
Actual program Output (the first cpt i.e for sprinkler is correct, while the rest aren't):

sprinkler cpt:
cloudy sprinkler =>1
f t=>1
f f=>1
t f=>1
t t=>1

cloudy cpt:
=>1

wetgrass cpt:
rain wetgrass =>1
f t=>1
f f=>1
t f=>1
t t=>1

rain cpt:
=>1

whereas the desired output should be:

sprinkler cpt:
cloudy sprinkler =>1
f t=>1
f f=>1
t f=>1
t t=>1

cloudy cpt:
cloudy =>1
f=>1
t=>1

wetgrass cpt:
sprinkler rain wetgrass =>1
f f f=>1
t t f=>1
t f t=>1
f t t=>1
t f f=>1
f t f=>1
t t t=>1
f f t=>1

rain cpt:
cloudy rain =>1
f t=>1
f f=>1
t f=>1
t t=>1
##############

#!/usr/bin/perl
use warnings;
use strict;


my $couldy=['cloudy','t','f']; # 'cloudy' is the node name and 't','f'
+ are the values this node can assume. Likewise for other nodes.
my $sprinkler=['sprinkler','t','f'];
my $rain=['rain','t','f'];
my $wetgrass=['wetgrass','t','f'];
my $values=[$sprinkler,$couldy,$rain,$wetgrass]; 


my @nodes=('sprinkler','cloudy','wetgrass','rain');

foreach my $i (@nodes) {
    my $nodeparents=[parentchildrelationship($i)];
    print "$i $nodeparents $values\n";
    my $hash1=cpt($i,$nodeparents,$values);

    foreach my $i (keys %{$hash1}) {
        print "$i=>${$hash1}{$i}\n";
        #print "$i\n";
    }
}


sub cpt {
    my $node=$_[0];
    my @parents=@{$_[1]};
    my @nodevalues=@{$_[2]};
    my %hash=();
    my @nodeindex=();
    my @temparray=();
    my $string="";
    my @temp=();
    my $s;

    foreach my $i (@parents) {
        for (my $j=0;$j<=$#nodevalues;$j++) {
                if ($i eq ${$nodevalues[$j]}[0]) {
                    push (@nodeindex, $j); 
                }
        }
    }
    
    foreach my $i (reverse @nodeindex) { 
        push (@temparray, $nodevalues[$i]); 
    }

    for (my $i=0;$i<=$#temparray;$i++) {
        $string="$string"."${$temparray[$i]}[0] "; 
        shift @{$temparray[$i]};
        $temp[$i]=join (",",@{$temparray[$i]});
    }

    $hash{$string}=1; #fill %hash with the header (header is simply th
+e name of the node along with its parents)
    $s = join "\\ ", map "{$_}", @temp;
    #print "$s\n";
    $hash{$_}=1 for glob $s; 
    
    return {%hash}; 
    
    $node="";
    @parents=(); 
    @nodevalues=();
    %hash=();
    @nodeindex=();
    @temparray=();
    $string="";
    @temp=();
    $s="";
}


sub parentchildrelationship {

    my $node=$_[0]; #or use "shift"
    my %parentchild=();
    my @nodeset=();

    %parentchild=('cloudy'=>['none'], 'sprinkler'=>['cloudy'], 'rain'=
+>['cloudy'], 'wetgrass'=>['rain','sprinkler']); #This is the structur
+e of the graph, for example node 'cloudy' has no parent; node 'sprink
+ler' has 'cloudy' as its parents;  node 'rain' has 'cloudy' as its pa
+rents;  node 'weygrass' has 'rain' and 'sprinkler' as its parents; (d
+irectionality of the arrow determines the child-parent relationship).
    push (@nodeset, $node, @{$parentchild{$node}}); 
    return @nodeset;
}
[download]

Comment on Graph model problem. Download Code

Replies are listed 'Best First'.
Re: Graph model problem. by kyle (Abbot) on May 09, 2007 at 19:04 UTC
Having looked over your code, I think I found your problem, and I have a few other comments. The only substantial change I made was the assignment to `@nodevalues` in the `cpt` sub. It now reads: `my @nodevalues= map { [ @{$_} ] } @{$_[2]};` [download] The reason why that helps is that `$_[2]` is a reference to an array containing other array references. What my change does is copy the arrays in the main array instead of copying the references. This is important because further down in your code, you do "`shift @{$temparray[$i]};`", which modifies one of those arrays. Without my modification, here is what happens. Before the first call to `cpt`, this is what `$values` looks like: `$VAR1 = [ [ 'sprinkler', 't', 'f' ], [ 'cloudy', 't', 'f' ], [ 'rain', 't', 'f' ], [ 'wetgrass', 't', 'f' ] ];` [download] After the first call to `cpt`, it looks like this: `$VAR1 = [ [ 't', 'f' ], [ 't', 'f' ], [ 'rain', 't', 'f' ], [ 'wetgrass', 't', 'f' ] ];` [download] Notice that the first two arrays have been modified. The above is the output of Data::Dumper, by the way. It's a pretty good way of visualizing your data structures. The fix I've shown will help this particular problem, but it would not have helped if the data structure involved were another level deep. For a more general solution to this kind of problem have a look at merlyn's column Deep copying, not Deep secrets. My other comments, briefly: I'm not sure why you'd be using glob. This code at the end of `cpt` is not necessary (and not executed): `$node=""; @parents=(); @nodevalues=(); %hash=(); @nodeindex=(); @temparray=(); $string=""; @temp=(); $s="";` [download] Because those are all lexical variables, they get new values every time `cpt` is called. In general, things like `for (my $j=0;$j<=$#nodevalues;$j++)` ought to be more like `foreach my $nv ( @nodevalues )`. You might want to look at some of CPAN's graph modules.	[reply] [d/l] [select]
Re: Graph model problem. by jdporter (Paladin) on May 09, 2007 at 19:00 UTC
`return {%hash}; $node=""; @parents=(); @nodevalues=(); %hash=(); @nodeindex=(); @temparray=(); $string=""; @temp=(); $s=""; }` [download] Are you aware that the effect of a return statement is immediate? All those assignments after the return, above, don't get executed. A word spoken in Mind will reach its own level, in the objective world, by its own weight	[reply] [d/l]
Re: Graph model problem. by johngg (Canon) on May 09, 2007 at 22:03 UTC
kyle has given you a very informative response which will hopefully move you onwards. I have a couple of comments about some of your code which may help you in the future, or at least save you some typing. Some consider it good practice to distinguish scalars holding simple values from those holding references to other data types, e.g. `$ref_to_array_of_colours = ['red', 'blue', 'green];`. I seem to be in a minority in preferring to use camel-case notation and would use the more succinct `$raColours`; I would use `$rhWhatever` for a hash reference, `$rsThing` for scalar ref. etc. I follow this scheme in my suggestions below. This section of your code `my $couldy=['cloudy','t','f']; # 'cloudy' is the node name and 't','f' + are the values this node can assume. Likewise for other nodes. my $sprinkler=['sprinkler','t','f']; my $rain=['rain','t','f']; my $wetgrass=['wetgrass','t','f']; my $values=[$sprinkler,$couldy,$rain,$wetgrass];` [download] Could be re-written without the intermediate variables as you don't seem to use them again. `my $raValueSets = [ ['cloudy', 't', 'f'], ['sprinkler', 't', 'f'], ['rain', 't', 'f'], ['wetgrass', 't', 'f'], ];` [download] Passing arguments to `sub cpt { ... }`, I would tend to keep the anonymous arrays as they are and de-reference them later, like this `sub cpt { my ($node, $raParents, $raNodeValues) = @_; ... foreach my $i ( @$raParents ) { for my $j ( 0 .. $#$raNodeValues ) { if ( $i eq $raNodeValues->[$j]->[0] ) { push @nodeindex, $j; } } } ... }` [download] Note that I use the `->` de-reference operator rather than the `${$x}[n]` notation, which can quickly become unreadable. In `sub parentchildrelationship { ... }` you do `my $node=$_[0]; #or use "shift"` but you can also do `my ($node) = @_;` [download] which puts the LHS in list context and assigns the first element of `@_` to the first element of that list, namely `$node`. Actually, I tend to use `shift`. I hope these points are of interest. Cheers, JohnGG	[reply] [d/l] [select]
Re: Graph model problem. by Limbic~Region (Chancellor) on May 10, 2007 at 13:54 UTC
newbio, In a number of /msg's, comments in the chatterbox, and a reply to your previous node - I have recommended that you do two things to get better responses. A visual representation of a simple example The desired output for that example I know this must be frustrating for you but, in my opinion, you still coming up short. For instance, here is one way you may have provided a visual representation of your network (which really isn't a tree): Read more... (506 Bytes) While you did provide the desired output, there is no explanation as to what it means. For instance, I would have probably labeled the output much better: Read more... (334 Bytes) You likely have a relatively simple mistake as indicated by the other replies in this node but that isn't easy for me to see. In fact, when I run your code I do not even get the output you claim it produces (array refs stringified in quotes). Your description of your problem does not make your intended results clear and your code is also not easy to digest. I could take the time to disect your problem and rewrite your code but that is what you should have done. Cheers - L~R	[reply] [d/l] [select]


Perl-Sensitive Sunglasses
	PerlMonks