perlchris has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl -w
use strict;
use warnings;

my $numberofinputnodes  = 2;
my $numberofhiddennodes = 2;
my $learningrate        = .1;
my $maxepochs           = 10;
my $errormargin         = .1;

# index 0 of each weight list is treated as the bias weight
my @weight       = ( .6, .6, .6 );
my @hiddenweight = ( .6, .6, .6 );

# each row: bias input (-1), two inputs, target (XOR)
my @data = (
    [ "-1", "0.000000", "0.000000", "0" ],
    [ "-1", "0.000000", "1.000000", "1" ],
    [ "-1", "1.000000", "0.000000", "1" ],
    [ "-1", "1.000000", "1.000000", "0" ],
);

my $numberofepochs = 1;
my $net     = 0;
my $netk    = 0;
my $avgrmse = 0;
my @outputh = ();
my @outputk = ();
my $deltak  = 0;
my $deltah  = 0;
my @rmse    = ();
my $maxrmse = 0;

while (   ( $numberofepochs <= $maxepochs )
       && ( ( $avgrmse == 0 ) || ( $avgrmse >= $errormargin ) ) ) {
    $avgrmse = 0;

    # training pass: forward propagate each example, then update weights
    for ( my $j = 0 ; $j <= $#data ; $j++ ) {

        # hidden layer: weighted sum of bias and inputs through a sigmoid
        for ( my $k = 0 ; $k < $#{ $data[0] } ; $k++ ) {
            $net = $net + $weight[$k] * $data[$j][$k];
        }
        $outputh[$j] = 1 / ( 1 + exp( -$net ) );
        $net = 0;

        # output unit: weighted sum of the hidden output(s)
        for ( my $o = 0 ; $o <= $numberofhiddennodes ; $o++ ) {
            if ( $o == 0 ) {
                $netk = -$hiddenweight[0];
            }
            $netk = $netk + $hiddenweight[$o] * $outputh[$j];
        }
        $outputk[$j] = 1 / ( 1 + exp( -$netk ) );
        $netk = 0;

        # backpropagate: delta for the output unit, then the hidden units
        $deltak = $outputk[$j] * ( 1 - $outputk[$j] )
                * ( $data[$j][-1] - $outputk[$j] );
        my @deltah;
        for ( my $o = 0 ; $o < $#{ $data[0] } ; $o++ ) {
            $deltah[$o] = $outputh[$j] * ( 1 - $outputh[$j] )
                        * $weight[$o] * $deltak;
        }

        # weight updates
        for ( my $m = 0 ; $m < $#{ $data[0] } ; $m++ ) {
            my $update = $learningrate * $deltah[$m] * $data[$j][$m];
            $weight[$m] = $weight[$m] + $update;
        }
        for ( my $n = 0 ; $n <= $numberofhiddennodes ; $n++ ) {
            my $update = $learningrate * $deltak * $outputk[$j];
            $hiddenweight[$n] = $hiddenweight[$n] + $update;
        }
        $deltak = 0;
        @deltah = ();
    }

    # evaluation pass: recompute the outputs with the updated weights
    for ( my $j = 0 ; $j <= $#data ; $j++ ) {
        for ( my $k = 0 ; $k < $#{ $data[0] } ; $k++ ) {
            $net = $net + $weight[$k] * $data[$j][$k];
        }
        $outputh[$j] = 1 / ( 1 + exp( -$net ) );
        for ( my $o = 1 ; $o < $numberofhiddennodes ; $o++ ) {
            $netk = $netk + $hiddenweight[$o] * $outputh[$j];
        }
        $netk = $netk - $hiddenweight[0];
        $outputk[$j] = 1 / ( 1 + exp( -$netk ) );
        $net  = 0;
        $netk = 0;
    }

    # per-example RMSE, plus the average and maximum over the epoch
    my $totalrmse = 0;
    for my $q ( 0 .. $#data ) {
        $rmse[$q] = sqrt( ( $data[$q][-1] - $outputk[$q] )**2 );
        $totalrmse = $totalrmse + $rmse[$q];
    }
    $avgrmse = $totalrmse / ( $#data + 1 );
    @rmse    = sort { $a <=> $b } @rmse;
    $maxrmse = $rmse[-1];

    if ( $numberofepochs == $maxepochs ) {
        print "\n", "***** Epoch $numberofepochs *****", "\n";
        print "Maximum RMSE: $maxrmse", "\n";
        print "Average RMSE: $avgrmse", "\n";
    }
    @rmse      = ();
    $totalrmse = 0;
    $numberofepochs++;
}
exit;

The final output looks like this:

***** Epoch 10 *****
Maximum RMSE: 0.570149294921984
Average RMSE: 0.500071929555965

The output should be this:

***** Epoch 10 *****
Maximum RMSE: 0.5432215742673802
Average RMSE: 0.4999911891911738

This shows it's behaving with some precision, but it isn't accurate, and I have no idea why. The values move in the right direction, but something seems wrong with exactly how much they change.

I'm trying to implement backpropagation with stochastic gradient descent (this example just uses fixed initial weights instead of random ones) for a Machine Learning course.
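For reference, the update rules I'm working from are the standard stochastic gradient descent rules for sigmoid units; in Mitchell's notation (as best I can reproduce it):

$$\delta_k = o_k(1 - o_k)(t_k - o_k), \qquad \delta_h = o_h(1 - o_h)\sum_{k \in outputs} w_{kh}\,\delta_k, \qquad \Delta w_{ji} = \eta\,\delta_j\,x_{ji}$$

where $o$ is a unit's sigmoid output, $t_k$ is the target, $\eta$ is the learning rate, and $x_{ji}$ is the $i$-th input to unit $j$.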

I've been able to get the final output incrementally closer to the correct answer over several revisions, but I haven't been able to improve it any further, so I think an additional set of eyes might be useful.

I'd like to think this code shows I understand how the algorithm works and that the problem boils down to some sort of scoping issue or accumulated error that I'm not seeing, but I don't know. Thus I humbly seek Perl Wisdom from the Monks! Thank you in advance for any suggestions.

Replies are listed 'Best First'.
Re: backpropagation accuracy issue
by GrandFather (Saint) on Feb 19, 2011 at 02:57 UTC

    As a quick test I added use bignum; (see bignum) to the test script and obtained essentially the same erroneous values to greater precision so I'd guess the errors are unlikely to be due to rounding issues or accumulated errors. That implies that the algorithm has an error somewhere. How did you obtain the "correct" values? Can you provide a reference to a public online resource that describe the algorithm?
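    For anyone who wants to repeat the experiment, it amounts to one extra line near the top of the script, something like:

    use strict;
    use warnings;
    use bignum;    # promotes numeric literals and arithmetic to Math::BigFloat;
                   # bignum also overrides exp() and log()

    Everything else runs unchanged, just much more slowly.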

    True laziness is hard work

      The correct values were provided and are posted here

      The algorithm is described in Tom M. Mitchell's "Machine Learning" (p. 98); there are also PDF slides here (slides 88-92, on pages 4 and 5 respectively).

      It's unfortunate that it's not a rounding error; I will comb through more closely and hopefully find where the problem is. If you have any suggestions, they are certainly appreciated. Thank you for your help so far.

        @hiddenweight after one epoch:
        $VAR1 = [ '0.596062038434604', '0.596062038434604', '0.596062038434604' ];

        After two:

        $VAR1 = [ '0.592154093746109', '0.592154093746109', '0.592154093746109' ];

        Shouldn't they be different? (Just guessing here.)
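
        (For the record, the dumps above came from something along these lines, run at the end of each training pass:)

        use Data::Dumper;
        print Dumper( \@hiddenweight );

        Looking at the posted update loop, identical elements may be no accident: every $hiddenweight[$n] gets the same $learningrate*$deltak*$outputk[$j] added to it, so weights that start equal stay equal.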

        Doesn't even start the same.

        Him:

        ***** Epoch 1 *****
        Maximum RMSE: 0.5435466682137927
        Average RMSE: 0.4999991292217466

        You:

        ***** Epoch 1 *****
        Maximum RMSE: 0.574001103043358
        Average RMSE: 0.50006970432383
      Further support that this is not a floating point issue: the max is correct to only one digit and the average to about four digits, but doubles carry roughly 17 significant digits (53 bits) of precision.
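
      A quick way to see how much precision is actually available (assuming a perl built with ordinary 64-bit doubles):

      perl -e 'printf "%.20g\n", 1/3'    # prints 0.33333333333333331483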

        If I could get both to 6 digits (or perhaps more) it'd be acceptable. The program that produced the output I'm testing against is not written in Perl, though. I don't know if the casts are implemented significantly differently in different languages, but I believe the other program was written in Java.
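
        For what it's worth, the quoted training values ("-1", "0.000000", "1.000000") are all exactly representable as IEEE 754 doubles, so Perl's string-to-number conversion shouldn't be where the two programs diverge. A quick sanity check:

        printf "%.17g\n", "1.000000" + 0;    # prints 1
        printf "%.17g\n", "-1" + 0;          # prints -1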

        We were allowed to write in any language and I thought Perl would be a good way to do it since that's what I've been reading on as of late.

        Again thank you for the help and suggestions. They are very appreciated. I will continue to look through and see if I can identify the problem with my implementation.