The delta rule
An important generalisation of the perceptron training algorithm, presented by Widrow and Hoff as the "least mean square" (LMS) learning procedure, extends this technique to continuous inputs and outputs. The LMS procedure, also known as the delta rule, has most often been applied with purely linear output units.
The LMS procedure finds the values of all the weights that minimise the error function by a method called gradient descent. The idea is to change each weight by an amount proportional to the negative of the derivative of the error, as measured on the current pattern, with respect to that weight. [Ben J. A., P. Patrick]
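The update described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes a single linear output unit y = sum_i(w_i * x_i) + b and a per-pattern error E = 0.5 * (d - y)^2, so that dE/dw_i = -(d - y) * x_i. The function name, learning rate, epoch count and toy data are all hypothetical choices made for the example.

```python
def train_delta_rule(patterns, targets, learning_rate=0.05, epochs=500):
    """Train one linear output unit with the delta (LMS) rule (illustrative sketch)."""
    weights = [0.0] * len(patterns[0])
    bias = 0.0
    for _ in range(epochs):
        for x, d in zip(patterns, targets):
            # Linear output of the unit for the current pattern.
            y = sum(w * xi for w, xi in zip(weights, x)) + bias
            # Error signal ("delta") on this pattern.
            delta = d - y
            # Gradient descent step: dE/dw_i = -(d - y) * x_i, so moving
            # against the gradient adds learning_rate * delta * x_i.
            weights = [w + learning_rate * delta * xi for w, xi in zip(weights, x)]
            bias += learning_rate * delta
    return weights, bias


# Toy data (assumed for illustration): targets follow d = 2x + 1 exactly.
patterns = [[0.0], [1.0], [2.0], [3.0]]
targets = [1.0, 3.0, 5.0, 7.0]
weights, bias = train_delta_rule(patterns, targets)
```

Because the toy targets are an exact linear function of the input, the learned weight and bias should approach 2 and 1 respectively. With noisy or inconsistent targets, a fixed learning rate only brings the weights close to the least-squares solution.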
The delta rule is limited to networks with only two layers of processing units (and one layer of weights). The computational abilities of neural networks with nonlinear processing units can be extended by including layers that intervene between input and output. (Is this also true for multi-layered networks of linear processing units? Why or why not?) The difficulty with multi-layer networks is how to assign error to units in intermediate (hidden) layers, for which target values do not exist and indeed cannot be known a priori.
Applying the delta rule and the perceptron learning rule to the same training data will generally lead to different results.
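A small experiment makes this difference concrete. The sketch below is illustrative only. It assumes targets coded as -1 and +1, a threshold output unit for the perceptron rule (which changes the weights only when a pattern is misclassified), and a linear output unit for the delta rule (which changes the weights on every pattern in proportion to the error). The function names, learning rates and the AND-style data set are hypothetical choices for the example.

```python
def net_input(weights, bias, x):
    """Weighted sum of the inputs plus bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(value):
    """Threshold output: +1 for positive net input, otherwise -1."""
    return 1.0 if value > 0 else -1.0


def perceptron_rule(patterns, targets, learning_rate=0.1, max_epochs=100):
    weights, bias = [0.0] * len(patterns[0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, d in zip(patterns, targets):
            # Update only when the thresholded output is wrong.
            if sign(net_input(weights, bias, x)) != d:
                weights = [w + learning_rate * d * xi for w, xi in zip(weights, x)]
                bias += learning_rate * d
                mistakes += 1
        if mistakes == 0:  # every pattern classified correctly: stop
            break
    return weights, bias


def delta_rule(patterns, targets, learning_rate=0.05, epochs=500):
    weights, bias = [0.0] * len(patterns[0]), 0.0
    for _ in range(epochs):
        for x, d in zip(patterns, targets):
            # Update on every pattern, proportional to the linear error.
            delta = d - net_input(weights, bias, x)
            weights = [w + learning_rate * delta * xi for w, xi in zip(weights, x)]
            bias += learning_rate * delta
    return weights, bias


# Same training data for both rules: logical AND with -1/+1 targets.
patterns = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [-1.0, -1.0, -1.0, 1.0]

p_weights, p_bias = perceptron_rule(patterns, targets)
d_weights, d_bias = delta_rule(patterns, targets)
print("perceptron rule:", p_weights, p_bias)
print("delta rule:     ", d_weights, d_bias)
```

On this data both rules end up classifying all four patterns correctly, but they arrive at different weights. The perceptron rule stops at the first separating solution it reaches, while the delta rule keeps adjusting the weights towards the least-squares fit of the linear output to the targets.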
NEURAL NETWORK MODELS