I understand the one node for the output layer, but I can't see how to use a linear activation function in a backprop network, because backprop uses the derivative of the activation function to propagate the error back through the network. Sample error-propagation code:
for (out = 0; out < network.size.output; out++) {
    network.error.output[out]
        = (network.neuron.target[out] - network.neuron.output[out])
        * sigmoid_derivative(network.neuron.output[out]);
}
That seems to fail because the derivative of a linear function is a constant 1.0, which (as I understand it) keeps the network from learning from its errors. Am I missing something basic here?