A step in the right direction:
By tinkering a little with your code, I've got it learning the data set at least some of the time, as follows:
Epoch = 40094 error = 0.0188258375344725
Epoch = 40095 error = 0.0188239027473993
1.99620990178657
999.999223878912
2.99174991748112
3.99621594182089
39.9963246624386
99.9965058634682
100.085765949921
199.996807865184
299.9971098669
19.9962642620954
29.9962944622671
20.0141162793859
(Compare with the code below).
What have I changed? Here's the code as it stands at the moment:
use AI::NNFlex::Backprop;
use AI::NNFlex::Dataset;
my $network = AI::NNFlex::Backprop->new(
    learningrate    => .00000001,
    fahlmanconstant => 0,
    momentum        => 0.4,
    bias            => 1,
);

$network->add_layer( nodes => 2, activationfunction => "linear" );
$network->add_layer( nodes => 2, activationfunction => "linear" );
$network->add_layer( nodes => 1, activationfunction => "linear" );

$network->init();
# Taken from Mesh ex_add.pl
my $dataset = AI::NNFlex::Dataset->new([
    [ 1,   1 ],   [ 2 ],
    [ 500, 500 ], [ 1000 ],
    [ 1,   2 ],   [ 3 ],
    [ 2,   2 ],   [ 4 ],
    [ 20,  20 ],  [ 40 ],
    [ 50,  50 ],  [ 100 ],
    [ 60,  40 ],  [ 100 ],
    [ 100, 100 ], [ 200 ],
    [ 150, 150 ], [ 300 ],
    [ 10,  10 ],  [ 20 ],
    [ 15,  15 ],  [ 30 ],
    [ 12,  8 ],   [ 20 ],
]);
my $err = 10;

# Stop after 40096 epochs -- don't want to wait more than that
for ( my $i = 0; ( $err > 0.001 ) && ( $i < 40096 ); $i++ ) {
    $err = $dataset->learn($network);
    print "Epoch = $i error = $err\n";
}
# Print the network's output for each input pair in the dataset
foreach ( @{ $dataset->run($network) } ) {
    foreach (@$_) { print $_ }
    print "\n";
}
# Optional check: compare the network's output against the true sum for a
# grid of small inputs
# foreach my $a ( 1..10 ) {
#     foreach my $b ( 1..10 ) {
#         my ($ans)   = $a + $b;
#         my ($nnans) = @{ $network->run( [ $a, $b ] ) };
#         print "[$a] [$b] ans=$ans but nnans=$nnans\n" unless $ans == $nnans;
#     }
# }
The alterations are:
- The tanh activation function doesn't play nicely with numbers > 1, so I've changed all the layers to a linear activation function (an alternative, rescaling the data to fit tanh, is sketched after this list).
- The numbers are quite large, so I've set the learning rate very very small.
- I've taken out the fahlman constant - it's difficult to say what that will do with linear activation, but I'd be surprised if it was anything good.
- I've changed the order in which the data items are presented, putting 500+500 directly after 1+1. That was a bit of a hunch, but it seemed to improve matters - possibly because the large weight changes then happen together rather than being scattered through the epoch, which would make things unstable.
- I changed the max number of epochs to 40096, because it was trending towards a solution but not reaching it in time.
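The tanh point could also be attacked from the other direction: keep a squashing activation function and rescale the data so it fits, instead of making the layers linear. A minimal sketch (untested; the divisor of 1000 is just an assumption based on the largest target in the set):

use AI::NNFlex::Dataset;

my $scale = 1000;

# Same (input, target) pairs as above, abbreviated here
my @pairs = (
    [ 1, 1 ],     [ 2 ],
    [ 500, 500 ], [ 1000 ],
    [ 12, 8 ],    [ 20 ],
);

# Divide every value by $scale so inputs and targets sit in [0, 1]
my @scaled = map { [ map { $_ / $scale } @$_ ] } @pairs;
my $dataset = AI::NNFlex::Dataset->new( [@scaled] );

# After training, scale the network's answer back up, e.g.
# my ($out) = @{ $network->run( [ 3 / $scale, 4 / $scale ] ) };
# print $out * $scale, "\n";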
I'll carry on looking at this - I've never really used this code for data that isn't binary encoded, so there will almost certainly be improvements that can be made. Looking at NeuralNet-Mesh, it learns this data set very quickly, so there may be something I can derive from looking at that code. But at least you can now derive and save a weight set that will do additions (although you might have to interrupt and restart a few times to get a good, quick run).
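Saving and reloading the weights from a good run would look something like this (a sketch, assuming the module's dump_state/load_state calls; 'add.wts' is an arbitrary filename):

# Save the trained weights to a flat file
$network->dump_state( filename => 'add.wts' );

# ...later, after rebuilding the same three-layer network and calling init()
$network->load_state( filename => 'add.wts' );
my ($sum) = @{ $network->run( [ 7, 9 ] ) };
print "7 + 9 = $sum\n";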
It's likely that it can be improved by:
- Altering the range of starting weights (with randomweights=>MAXIMUM STARTING VALUE, perhaps set to 20 to start)
- Experimenting a little more with (probably smaller) values for learningrate & momentum (a sketch of both tweaks follows this list)
- Changing the order of the dataset so it runs up through the orders of magnitude (answer=10, answer=20, answer=100, etc.)
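A sketch of the first two tweaks together (the specific numbers - randomweights=>20 and the smaller learningrate and momentum - are only starting guesses, not tested values):

my $network = AI::NNFlex::Backprop->new(
    learningrate    => .000000001,
    momentum        => 0.2,
    fahlmanconstant => 0,
    randomweights   => 20,
    bias            => 1,
);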
I'll post again on this thread if I find anything really useful.
Update: Gah! The fahlman constant is switched on by default. I've amended the code above to set the fahlman constant to 0, and that seems to work better.
--------------------------------------------------------------
g0n, backpropagated monk