newbio has asked for the wisdom of the Perl Monks concerning the following question:
Since there was no response from dear Monks to my earlier post..:(, I guess I need to present the problem in a more visual manner by taking an example. Thus, I have redrafted my code and the problem. Hope, this time it works and dear Monks get pleased and solve my problem...:).
OK, the program does the following:
Given the graphical network topology with all nodes and their associated values, the program creates conditional probability tables (cpt) associated with each node in the network. As an example, I have tried to implement the graphical network shown here: http://www.snn.ru.nl/~wimw/asci2003/bayesintro.html . Please note that CPTs in the example are the tables shown next to each node (see the URL). So, for each combination of values of a particular node and its parents a probability value is assigned in the node cpt. Since the probability is conditional, all rows in a cpt add up to 1. My program does not learn the probability values "yet", and simply assigns 1 for each of those probability slots.
However, there appears some error in the code somewhere which I am unable to detect. When I pass single node name in my code through @nodes (such as @nodes=('sprinkler'); or @nodes=('wetgrass') etc);, the result (i.e. cpt) comes out correct. However, when I pass multiple nodes through @nodes (such as @nodes=('sprinkler','cloudy','wetgrass','rain');) then the first cpt (for node sprinkler) in the result comes out correct, however, the remaining ones are wrong (output shown below).
Please guide me as to where the fault lies in the code. Also, are you happy with my data structures for cpts?
The idea is to use the graph structure for training and inference later.
##############
Actual program Output (the first cpt i.e for sprinkler is correct, while the rest aren't):
sprinkler cpt:
cloudy sprinkler =>1
f t=>1
f f=>1
t f=>1
t t=>1
cloudy cpt:
=>1
wetgrass cpt:
rain wetgrass =>1
f t=>1
f f=>1
t f=>1
t t=>1
rain cpt:
=>1
whereas the desired output should be:
sprinkler cpt:
cloudy sprinkler =>1
f t=>1
f f=>1
t f=>1
t t=>1
cloudy cpt:
cloudy =>1
f=>1
t=>1
wetgrass cpt:
sprinkler rain wetgrass =>1
f f f=>1
t t f=>1
t f t=>1
f t t=>1
t f f=>1
f t f=>1
t t t=>1
f f t=>1
rain cpt:
cloudy rain =>1
f t=>1
f f=>1
t f=>1
t t=>1
##############
#!/usr/bin/perl
use warnings;
use strict;
my $couldy=['cloudy','t','f']; # 'cloudy' is the node name and 't','f'
+ are the values this node can assume. Likewise for other nodes.
my $sprinkler=['sprinkler','t','f'];
my $rain=['rain','t','f'];
my $wetgrass=['wetgrass','t','f'];
my $values=[$sprinkler,$couldy,$rain,$wetgrass];
my @nodes=('sprinkler','cloudy','wetgrass','rain');
foreach my $i (@nodes) {
my $nodeparents=[parentchildrelationship($i)];
print "$i $nodeparents $values\n";
my $hash1=cpt($i,$nodeparents,$values);
foreach my $i (keys %{$hash1}) {
print "$i=>${$hash1}{$i}\n";
#print "$i\n";
}
}
sub cpt {
my $node=$_[0];
my @parents=@{$_[1]};
my @nodevalues=@{$_[2]};
my %hash=();
my @nodeindex=();
my @temparray=();
my $string="";
my @temp=();
my $s;
foreach my $i (@parents) {
for (my $j=0;$j<=$#nodevalues;$j++) {
if ($i eq ${$nodevalues[$j]}[0]) {
push (@nodeindex, $j);
}
}
}
foreach my $i (reverse @nodeindex) {
push (@temparray, $nodevalues[$i]);
}
for (my $i=0;$i<=$#temparray;$i++) {
$string="$string"."${$temparray[$i]}[0] ";
shift @{$temparray[$i]};
$temp[$i]=join (",",@{$temparray[$i]});
}
$hash{$string}=1; #fill %hash with the header (header is simply th
+e name of the node along with its parents)
$s = join "\\ ", map "{$_}", @temp;
#print "$s\n";
$hash{$_}=1 for glob $s;
return {%hash};
$node="";
@parents=();
@nodevalues=();
%hash=();
@nodeindex=();
@temparray=();
$string="";
@temp=();
$s="";
}
sub parentchildrelationship {
my $node=$_[0]; #or use "shift"
my %parentchild=();
my @nodeset=();
%parentchild=('cloudy'=>['none'], 'sprinkler'=>['cloudy'], 'rain'=
+>['cloudy'], 'wetgrass'=>['rain','sprinkler']); #This is the structur
+e of the graph, for example node 'cloudy' has no parent; node 'sprink
+ler' has 'cloudy' as its parents; node 'rain' has 'cloudy' as its pa
+rents; node 'weygrass' has 'rain' and 'sprinkler' as its parents; (d
+irectionality of the arrow determines the child-parent relationship).
push (@nodeset, $node, @{$parentchild{$node}});
return @nodeset;
}
Re: Graph model problem.
by kyle (Abbot) on May 09, 2007 at 19:04 UTC
|
Having looked over your code, I think I found your problem, and I have a few other comments.
The only substantial change I made was the assignment to @nodevalues in the cpt sub. It now reads:
my @nodevalues= map { [ @{$_} ] } @{$_[2]};
The reason why that helps is that $_[2] is a reference to an array containing other array references. What my change does is copy the arrays in the main array instead of copying the references. This is important because further down in your code, you do "shift @{$temparray[$i]};", which modifies one of those arrays. Without my modification, here is what happens. Before the first call to cpt, this is what $values looks like:
$VAR1 = [
[
'sprinkler',
't',
'f'
],
[
'cloudy',
't',
'f'
],
[
'rain',
't',
'f'
],
[
'wetgrass',
't',
'f'
]
];
After the first call to cpt, it looks like this:
$VAR1 = [
[
't',
'f'
],
[
't',
'f'
],
[
'rain',
't',
'f'
],
[
'wetgrass',
't',
'f'
]
];
Notice that the first two arrays have been modified. The above is the output of Data::Dumper, by the way. It's a pretty good way of visualizing your data structures.
The fix I've shown will help this particular problem, but it would not have helped if the data structure involved were another level deep. For a more general solution to this kind of problem have a look at merlyn's column Deep copying, not Deep secrets.
My other comments, briefly:
- I'm not sure why you'd be using glob.
- This code at the end of cpt is not necessary (and not executed):
$node="";
@parents=();
@nodevalues=();
%hash=();
@nodeindex=();
@temparray=();
$string="";
@temp=();
$s="";
Because those are all lexical variables, they get new values every time cpt is called.
- In general, things like for (my $j=0;$j<=$#nodevalues;$j++) ought to be more like foreach my $nv ( @nodevalues ).
- You might want to look at some of CPAN's graph modules.
| [reply] [d/l] [select] |
Re: Graph model problem.
by jdporter (Paladin) on May 09, 2007 at 19:00 UTC
|
return {%hash};
$node="";
@parents=();
@nodevalues=();
%hash=();
@nodeindex=();
@temparray=();
$string="";
@temp=();
$s="";
}
Are you aware that the effect of a return statement is immediate?
All those assignments after the return, above, don't get executed.
A word spoken in Mind will reach its own level, in the objective world, by its own weight
| [reply] [d/l] |
Re: Graph model problem.
by johngg (Canon) on May 09, 2007 at 22:03 UTC
|
my $couldy=['cloudy','t','f']; # 'cloudy' is the node name and 't','f'
+ are the values this node can assume. Likewise for other nodes.
my $sprinkler=['sprinkler','t','f'];
my $rain=['rain','t','f'];
my $wetgrass=['wetgrass','t','f'];
my $values=[$sprinkler,$couldy,$rain,$wetgrass];
Could be re-written without the intermediate variables as you don't seem to use them again.
my $raValueSets =
[
['cloudy', 't', 'f'],
['sprinkler', 't', 'f'],
['rain', 't', 'f'],
['wetgrass', 't', 'f'],
];
Passing arguments to sub cpt { ... }, I would tend to keep the anonymous arrays as they are and de-reference them later, like this
sub cpt
{
my ($node, $raParents, $raNodeValues) = @_;
...
foreach my $i ( @$raParents )
{
for my $j ( 0 .. $#$raNodeValues )
{
if ( $i eq $raNodeValues->[$j]->[0] )
{
push @nodeindex, $j;
}
}
}
...
}
Note that I use the -> de-reference operator rather than the ${$x}[n] notation, which can quickly become unreadable. In sub parentchildrelationship { ... } you do my $node=$_[0]; #or use "shift" but you can also do
my ($node) = @_;
which puts the LHS in list context and assigns the first element of @_ to the first element of that list, namely $node. Actually, I tend to use shift.
I hope these points are of interest. Cheers, JohnGG | [reply] [d/l] [select] |
Re: Graph model problem.
by Limbic~Region (Chancellor) on May 10, 2007 at 13:54 UTC
|
newbio,
In a number of /msg's, comments in the chatterbox, and a reply to your previous node - I have recommended that you do two things to get better responses.
- A visual representation of a simple example
- The desired output for that example
I know this must be frustrating for you but, in my opinion, you still coming up short. For instance, here is one way you may have provided a visual representation of your network (which really isn't a tree):
While you did provide the desired output, there is no explanation as to what it means. For instance, I would have probably labeled the output much better:
You likely have a relatively simple mistake as indicated by the other replies in this node but that isn't easy for me to see. In fact, when I run your code I do not even get the output you claim it produces (array refs stringified in quotes).
Your description of your problem does not make your intended results clear and your code is also not easy to digest. I could take the time to disect your problem and rewrite your code but that is what you should have done.
| [reply] [d/l] [select] |
|
|