tlpriest has asked for the wisdom of the Perl Monks concerning the following question:

I'm looking to do some financial analysis with neural networks. I'm casually familiar with neural nets but I'm not a mathematician and don't want to delve into their theory more than is necessary to make them work and to validate the results.

That said, I'm not able to get the examples to run cleanly on my system. I first downloaded AI-NeuralNets-Mesh and ran the ex_add, ex_sub, and ex_mult examples. Add worked great (and I was very excited by the possibilities) as did xor, but subtract and multiply gave grossly wrong answers. Looking more closely, I realized this package has not been supported in a long time.

I then downloaded NNFlex. The xor demo example works fine, but I was unable to modify it to make add, subtract, or multiply work, and also the cars demo did not work. Specifics:

When modifying xor to do add, subtract, or multiply as in the Mesh examples, if I add more than three training rules, the learning process gets stuck on the same error value after the third or fourth iteration and loops on that same value indefinitely. If I limit the iterations to something like 1024 and prematurely end learning, I get the same wrong result for every input.

The cars example provided with NNFlex shows the same behavior. I let it run overnight to make sure I wasn't just being impatient. During learning, Epoch 0 returned 1.#INF and Epoch 1 returned 2.69464268337377e+024, and it stayed at that value for 26,131 epochs while consuming 100% of the CPU for about 8 hours. That's the same kind of behavior I was getting when trying to make simple modifications to the xor example.

Some system details:

perl -v:
This is perl, v5.8.6 built for MSWin32-x86-multi-thread
Binary build 811 provided by ActiveState Corp.

NNFlex-0.23

I cannot use NNEasy because I don't have VC++, so I can't compile the inline C code even though I have gcc under Cygwin. (I tried to fake it by editing the Makefile in the temporary directory and fixing the compiler switches, but no joy.)

Other than the expected fork and OS memory management issues, I've had good luck with perl on windows. I have several gigabytes of data and some automated update routines on this platform, so I'd rather not move it to Linux if I can help it.

Additionally, on Fedora Core release 2 with perl v5.8.3 built for i386-linux-thread-multi, cars produces Epoch 0: Error = nan, and then nan repeatedly for some large number of iterations before stopping.

On Linux, xor_minimal runs for 5322 epochs and then returns the right result; on w32, it runs for 86 epochs and returns correct results.

Travis

PS: I have some theoretical questions about neural nets and data correlation that don't fit here on the Perl discussion boards. Is there an appropriate place for these questions? From what I can tell, NNFlex is so far used mostly for academic purposes. If I can bend it to do the analysis I'm interested in, I don't mind contributing back example code and HOWTOs for inclusion in a later release.

Replies are listed 'Best First'.
Re: NNflex problems (win32)
by g0n (Priest) on May 13, 2005 at 12:36 UTC
    Can you post some example code from what you were doing with NNFlex? It sounds like you've probably got an unsolvable dataset.

    I'm surprised you're having a problem with the cars example, as that is usually very stable. Did you run it more than once? Backprop nets are very much dependent on the starting position, which is randomised, so it is possible for one run to fail and the next to work perfectly. This is also why the same network can take thousands of epochs to learn, or a few tens - it's not related to running on w32 or linux.

    You'd probably find the same things happening with NNEasy anyway, as the algorithm was derived from an early version of NNFlex.

    For discussion of neural net theory, try comp.ai.neural-nets.

    (and if you've mailed the contact email address on the NNFlex docs and not had an answer, sorry. I can only get the mail from that account at weekends).

    Update: Ouch! Just checked on my source. The cars example has a line

    errorfunction=>'atanh',

    that should not be there. It's an experimental approach that makes the network much faster on the rare occasions it converges, but very unstable! Remove that line, and you should find cars.pl runs OK.
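    For anyone making the change by hand, this is roughly what the edit looks like. Note that this is only a sketch: the other constructor parameters shown here are illustrative placeholders, not copied verbatim from cars.pl; the only point of the fix is the errorfunction line.

        use AI::NNFlex::Backprop;

        # Placeholder parameters for illustration; only the commented-out
        # errorfunction line is the actual point of the fix.
        my $network = AI::NNFlex::Backprop->new(
            learningrate => .2,
            momentum     => 0.6,
            bias         => 1,
            # errorfunction => 'atanh',   # experimental & unstable -- delete this line
        );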

    --------------------------------------------------------------

    g0n, backpropagated monk

      Ah! I have very little neural net experience and didn't realize the learning process was not strictly deterministic but had a random element. I see that now by running xor and xor_minimal several times sequentially: each run had a different number of learning epochs but produced the same result.

      There's something that troubles me, though. The learning process occasionally seems to run indefinitely. For example, my last xor_minimal run went for 67,481 epochs before I decided to stop it. I ran xor about 8 times, all with epoch counts under 256, but the last time I ran it I let it go for a few minutes, up to epoch 37892, before I killed it -- it got stuck on an error value of 5.30893529204799. Is it possible that, depending on the random initialization, the learning process even for small nets will never reach an error level less than .001?

      I was awed by the implications of the ex_add example in the Mesh package. If it's possible to "teach" a system to add, subtract, or do other basic math by example with reasonable accuracy, then I have pretty high expectations for my data. So, right now, I'm trying to prove to myself that I can "teach" a net to do basic operations -- things that I can verify independently and easily. When I set it loose on my data, after the initial tweaking and verifying, it's going to get more and more expensive (in terms of manual labor) for me to verify all of the results, so I'd like some confidence up front that I understand what it's doing.

      Your suggested modification, removing atanh as the errorfunction, seems to work on Linux. It's on Epoch 400 with an error around .35, which has steadily been dropping from the initial error level of about 200. I'll let it run a little longer.

      With that said, I've taken your xor example and modified it with the ex_add example from the Mesh package. Here's the resulting code:

      use AI::NNFlex::Backprop;
      use AI::NNFlex::Dataset;

      my $network = AI::NNFlex::Backprop->new(
          learningrate    => .2,
          bias            => 1,
          fahlmanconstant => 0.1,
          momentum        => 0.6,
          round           => 1);

      $network->add_layer( nodes => 2, activationfunction => "tanh" );
      $network->add_layer( nodes => 2, activationfunction => "tanh" );
      $network->add_layer( nodes => 1, activationfunction => "linear" );

      $network->init();

      # Taken from Mesh ex_add.pl
      my $dataset = AI::NNFlex::Dataset->new([
          [ 1, 1 ],     [ 2 ],
          [ 1, 2 ],     [ 3 ],
          [ 2, 2 ],     [ 4 ],
          [ 20, 20 ],   [ 40 ],
          [ 50, 50 ],   [ 100 ],
          [ 60, 40 ],   [ 100 ],
          [ 100, 100 ], [ 200 ],
          [ 150, 150 ], [ 300 ],
          [ 500, 500 ], [ 1000 ],
          [ 10, 10 ],   [ 20 ],
          [ 15, 15 ],   [ 30 ],
          [ 12, 8 ],    [ 20 ],
      ]);

      my $err = 10;
      # Stop after 4096 epochs -- don't want to wait more than that
      for ( my $i = 0; ($err > 0.001) && ($i < 4096); $i++ ) {
          $err = $dataset->learn($network);
          print "Epoch = $i error = $err\n";
      }

      foreach (@{$dataset->run($network)}) {
          foreach (@$_) { print $_ }
          print "\n";
      }

      print "this should be 1 - ".@{$network->run([0,1])}."\n";

      # foreach my $a ( 1..10 ) {
      #     foreach my $b ( 1..10 ) {
      #         my($ans)   = $a+$b;
      #         my($nnans) = @{$network->run([$a,$b])};
      #         print "[$a] [$b] ans=$ans but nnans=$nnans\n" unless $ans == $nnans;
      #     }
      # }
        To evaluate the learning progress of a network, you need to know a little more about how neural nets find solutions.

        Imagine the space of all possible solutions (from perfect to awful) as a 3D landscape. The altitude is the error value; the coordinates are the internal and external values in the NN sim. Training the neural net is something like a marble rolling along this terrain, tending to roll downhill. Sometimes the marble will stop at the bottom of a sinkhole on a mesa, nowhere near the global minimum error value.

        The odd random kick may send the search out of the local minimum toward a better solution. The size of the random kick may be changed over time, such that later kicks tend to be smaller. Multiple training sessions may be run, and the "mean training time" to a certain error limit computed. (Neural nets also benefit from having noisy connections, even after training is complete. )

        A single training run may fail to meet the error spec, even if run forever.

        Some problem spaces may also have a fractal or chaotic solution space -- slight changes to the starting conditions can drastically alter the solution found.
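        As a concrete illustration of the "multiple training sessions" idea, something like the sketch below re-initialises the network and tries again whenever a run gets stuck above the target error. It's a rough, untested outline using only the NNFlex calls already shown in this thread; the epoch limit, restart count, and network parameters are arbitrary.

            use AI::NNFlex::Backprop;
            use AI::NNFlex::Dataset;

            # XOR data, as in the xor examples shipped with NNFlex
            my $dataset = AI::NNFlex::Dataset->new([
                [ 0, 0 ], [ 0 ],
                [ 0, 1 ], [ 1 ],
                [ 1, 0 ], [ 1 ],
                [ 1, 1 ], [ 0 ],
            ]);

            my $target_error = 0.001;
            my $max_epochs   = 2000;   # give up on a run after this many epochs
            my $max_restarts = 10;     # and try at most this many fresh starts

            RUN: for my $run ( 1 .. $max_restarts ) {

                # A brand new network each run means new random starting weights,
                # i.e. the marble gets dropped somewhere else on the landscape.
                my $network = AI::NNFlex::Backprop->new(
                    learningrate    => .2,
                    fahlmanconstant => 0.1,
                    momentum        => 0.6,
                    bias            => 1 );

                $network->add_layer( nodes => 2, activationfunction => "tanh" );
                $network->add_layer( nodes => 2, activationfunction => "tanh" );
                $network->add_layer( nodes => 1, activationfunction => "linear" );
                $network->init();

                for my $epoch ( 1 .. $max_epochs ) {
                    my $err = $dataset->learn($network);
                    if ( $err < $target_error ) {
                        print "Run $run converged after $epoch epochs (error $err)\n";
                        last RUN;
                    }
                }
                print "Run $run failed to reach $target_error, restarting\n";
            }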

        -QM
        --
        Quantum Mechanics: The dreams stuff is made of

        A step in the right direction:

        By tinkering a little with your code, I've got it learning the data set at least some of the time, as follows:

        Epoch = 40094 error = 0.0188258375344725
        Epoch = 40095 error = 0.0188239027473993
        1.99620990178657
        999.999223878912
        2.99174991748112
        3.99621594182089
        39.9963246624386
        99.9965058634682
        100.085765949921
        199.996807865184
        299.9971098669
        19.9962642620954
        29.9962944622671
        20.0141162793859

        (Compare with the code below).

        What have I changed? Here's the code as it stands at the moment:

        use AI::NNFlex::Backprop;
        use AI::NNFlex::Dataset;

        my $network = AI::NNFlex::Backprop->new(
            learningrate    => .00000001,
            fahlmanconstant => 0,
            momentum        => 0.4,
            bias            => 1);

        $network->add_layer( nodes => 2, activationfunction => "linear" );
        $network->add_layer( nodes => 2, activationfunction => "linear" );
        $network->add_layer( nodes => 1, activationfunction => "linear" );

        $network->init();

        # Taken from Mesh ex_add.pl
        my $dataset = AI::NNFlex::Dataset->new([
            [ 1, 1 ],     [ 2 ],
            [ 500, 500 ], [ 1000 ],
            [ 1, 2 ],     [ 3 ],
            [ 2, 2 ],     [ 4 ],
            [ 20, 20 ],   [ 40 ],
            [ 50, 50 ],   [ 100 ],
            [ 60, 40 ],   [ 100 ],
            [ 100, 100 ], [ 200 ],
            [ 150, 150 ], [ 300 ],
            [ 10, 10 ],   [ 20 ],
            [ 15, 15 ],   [ 30 ],
            [ 12, 8 ],    [ 20 ],
        ]);

        my $err = 10;
        # Stop after 40096 epochs -- don't want to wait more than that
        for ( my $i = 0; ($err > 0.001) && ($i < 40096); $i++ ) {
            $err = $dataset->learn($network);
            print "Epoch = $i error = $err\n";
        }

        foreach (@{$dataset->run($network)}) {
            foreach (@$_) { print $_ }
            print "\n";
        }

        # foreach my $a ( 1..10 ) {
        #     foreach my $b ( 1..10 ) {
        #         my($ans)   = $a+$b;
        #         my($nnans) = @{$network->run([$a,$b])};
        #         print "[$a] [$b] ans=$ans but nnans=$nnans\n" unless $ans == $nnans;
        #     }
        # }

        The alterations are:

        • The tanh activation function doesn't play nicely with numbers > 1, so I've changed all the layers to a linear activation function (see the short demonstration after this list).
        • The numbers are quite large, so I've set the learning rate very very small.
        • I've taken out the fahlman constant - it's difficult to say what that will do with linear activation, but I'd be surprised if it was anything good.
        • I've changed the order in which the data items are presented - putting 500+500 directly after 1+1. That was a bit of a hunch, but seemed to improve matters. Possibly because you then get large changes in weights together rather than in different places in the epoch, which will make things unstable.
        • I changed the max number of epochs to 40096, because it was trending towards a solution but not reaching it in time.
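        To make the point about tanh concrete, here's a quick demonstration using the tanh from the core POSIX module. tanh squashes any input into the open interval (-1, 1), so a node whose output passes through tanh can never produce targets like 40 or 1000:

            use POSIX qw(tanh);

            # The output creeps towards 1 and never exceeds it, no matter
            # how large the input gets.
            printf "tanh(%s) = %.10f\n", $_, tanh($_) for 0.5, 1, 2, 5, 10, 100;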
        I'll carry on looking at this - I've never really used this code for non-binary-represented data, so there will almost certainly be improvements that can be made. Looking at NeuralNet-Mesh, it learns this data set very quickly, so there may be something I can derive from looking at that code. But at least you can now derive and save a weight set that will do additions (although you might have to interrupt and restart a few times to get a good, quick run).

        It's likely that it can be improved by:

        • Altering the range of starting positions (with randomweights=>MAXIMUM STARTING VALUE, perhaps set to 20 to start) - see the sketch after this list
        • Experimenting a little more with (probably smaller) values for learningrate & momentum
        • Changing the order of the dataset to orders of magnitude (answer=10, answer=20, answer=100 etc)
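        Here's a hedged sketch of the first two suggestions. The randomweights parameter is used as described above; the particular learningrate and momentum values are just plausible starting points to experiment with, not a configuration I've verified converges:

            use AI::NNFlex::Backprop;

            # Experimental starting point only - adjust and re-run as needed.
            my $network = AI::NNFlex::Backprop->new(
                learningrate    => .000000005,   # even smaller than before
                momentum        => 0.2,          # likewise reduced
                fahlmanconstant => 0,
                bias            => 1,
                randomweights   => 20 );         # cap the random starting weights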

        I'll post again on this thread if I find anything really useful.

        Update: Gah! The fahlman constant has a non-zero default, so simply leaving it out of the constructor doesn't remove it. I've amended the code to set the fahlman constant explicitly to 0; that seems to work better.

        --------------------------------------------------------------

        g0n, backpropagated monk

Re: NNflex problems (win32)
by g0n (Priest) on Jun 16, 2005 at 15:26 UTC
    FWIW, I've finally had a bit of time to revisit this. Sorry it's taken so long. The following code works fairly well: it learns very quickly and generalises rather nicely. The problem appears to be that with a small learning constant (needed for large numbers) the network takes forever to learn, while with a large learning constant the numbers rapidly go out of range.

    In addition, bias should not be used (it is only needed for networks where zero activation levels may require true outputs), and the starting weights have been fixed to 1, to keep them positive.

    use AI::NNFlex::Backprop;
    use AI::NNFlex::Dataset;

    my $network = AI::NNFlex::Backprop->new(
        learningrate    => .00001,
        fahlmanconstant => 0,
        fixedweights    => 1,
        momentum        => 0.3,
        bias            => 0);

    $network->add_layer( nodes => 2, activationfunction => "linear" );
    $network->add_layer( nodes => 2, activationfunction => "linear" );
    $network->add_layer( nodes => 1, activationfunction => "linear" );

    $network->init();

    # Taken from Mesh ex_add.pl
    my $dataset = AI::NNFlex::Dataset->new([
        [ 1, 1 ],   [ 2 ],
        [ 1, 2 ],   [ 3 ],
        [ 2, 2 ],   [ 4 ],
        [ 20, 20 ], [ 40 ],
        [ 10, 10 ], [ 20 ],
        [ 15, 15 ], [ 30 ],
        [ 12, 8 ],  [ 20 ],
    ]);

    my $err = 10;
    # Stop after 4096 epochs -- don't want to wait more than that
    for ( my $i = 0; ($err > 0.0001) && ($i < 4096); $i++ ) {
        $err = $dataset->learn($network);
        print "Epoch = $i error = $err\n";
    }

    foreach (@{$dataset->run($network)}) {
        foreach (@$_) { print $_ }
        print "\n";
    }

    print "this should be 4000 - ";
    $network->run([2000,2000]);
    foreach ( @{$network->output} ) { print $_."\n"; }

    foreach my $a ( 1..10 ) {
        foreach my $b ( 1..10 ) {
            my($ans)   = $a+$b;
            my($nnans) = @{$network->run([$a,$b])};
            print "[$a] [$b] ans=$ans but nnans=$nnans\n" unless $ans == $nnans;
        }
    }

    --------------------------------------------------------------

    g0n, backpropagated monk