Dr. Qadri Hamarsheh
Supervised Learning in Neural Networks
(Part 2)
Multilayer neural networks (back-propagation training algorithm)
The input signals are propagated in a forward direction on a layer-by-layer basis.
Learning in a multilayer network proceeds the same way as for a
perceptron.
A training set of input patterns is presented to the network.
The network computes its output pattern, and if there is an error (that is, a
difference between the actual and desired output patterns), the weights are
adjusted to reduce this error.
In a back-propagation neural network, the learning algorithm has two
phases.
o First, a training input pattern is presented to the network input layer.
The network propagates the input pattern from layer to layer until the
output pattern is generated by the output layer.
o Second, if this pattern is different from the desired output, an error is
calculated and then propagated backwards through the network from
the output layer to the input layer. The weights are modified as the
error is propagated.
back-propagation neural network
The back-propagation training algorithm
Backpropagation is a common method for training a neural network. Its goal
is to optimize the weights so that the network can learn how to correctly
map arbitrary inputs to outputs.
The backpropagation method consists of the following steps:
Step 1: Initialization; set all the weights and threshold levels of the
network to random numbers uniformly distributed inside a range:
Where Fi is the total number of inputs of neuron i in the network.
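A minimal sketch of this initialization step. The bound 2.4/Fi used here is a commonly cited heuristic for this kind of range and is an assumption, as are the helper name `init_uniform` and the fixed seed:

```python
import random

def init_uniform(fan_in, count, rng=random):
    """Draw `count` values uniformly from [-2.4/fan_in, +2.4/fan_in].

    fan_in plays the role of Fi, the total number of inputs of the neuron.
    The constant 2.4 is an assumed heuristic, not taken from the text.
    """
    limit = 2.4 / fan_in
    return [rng.uniform(-limit, limit) for _ in range(count)]

rng = random.Random(0)  # fixed seed only to make the sketch repeatable

# A hidden neuron with two inputs: two weights plus one threshold level.
hidden_weights = init_uniform(fan_in=2, count=2, rng=rng)
hidden_threshold = init_uniform(fan_in=2, count=1, rng=rng)[0]
```

Scaling the range by the number of inputs keeps the initial weighted sums small, so the neurons start away from the flat ends of the sigmoid.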
Step 2: Activation; activate the back-propagation neural network by
applying inputs x1(p), x2(p), …, xn(p) and desired outputs
yd,1(p), yd,2(p), …, yd,n(p) (forward pass).
o Calculate the actual outputs of the neurons in the hidden layer:
Where n is the number of inputs of neuron j in the hidden layer.
o Calculate the actual outputs of the neurons in the output layer:
Where m is the number of inputs of neuron k in the output layer
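The forward-pass formulas for this step can be written out as follows. This is a sketch in the standard form, assuming every neuron uses the logistic sigmoid and that the threshold level is subtracted from the weighted sum (the convention the θ values in the Exclusive-OR example follow):

```latex
\begin{align}
y_j(p) &= \operatorname{sigmoid}\!\left[\sum_{i=1}^{n} x_i(p)\, w_{ij}(p) - \theta_j\right]
  && \text{(hidden neuron } j\text{)} \\
y_k(p) &= \operatorname{sigmoid}\!\left[\sum_{j=1}^{m} y_j(p)\, w_{jk}(p) - \theta_k\right]
  && \text{(output neuron } k\text{)} \\
\operatorname{sigmoid}(X) &= \frac{1}{1 + e^{-X}}
\end{align}
```

Here the inputs of output neuron k are the hidden-layer outputs y_j(p), so m is the number of hidden neurons feeding neuron k.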
Step 3: Weight training (back-propagate)
o Update the weights in the back-propagation network by propagating
backward the errors associated with the output neurons.
Calculate the error gradient for the neurons in the output layer:
Where
Calculate the weight corrections:
Update the weights at the output neurons:
o Calculate the error gradient for the neurons in the hidden layer:
Calculate the weight corrections:
Update the weights at the hidden neurons:
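The weight-training formulas for Step 3, written out as a sketch in the standard form for sigmoid neurons. The symbol l (the number of output neurons) and the treatment of a threshold as a weight on a fixed input of -1 are assumptions introduced here for completeness:

```latex
\begin{align}
\text{Output layer:}\quad
\delta_k(p) &= y_k(p)\,\bigl[1 - y_k(p)\bigr]\, e_k(p),
  \qquad e_k(p) = y_{d,k}(p) - y_k(p) \\
\Delta w_{jk}(p) &= \alpha\, y_j(p)\, \delta_k(p) \\
w_{jk}(p+1) &= w_{jk}(p) + \Delta w_{jk}(p) \\[6pt]
\text{Hidden layer:}\quad
\delta_j(p) &= y_j(p)\,\bigl[1 - y_j(p)\bigr] \sum_{k=1}^{l} \delta_k(p)\, w_{jk}(p) \\
\Delta w_{ij}(p) &= \alpha\, x_i(p)\, \delta_j(p) \\
w_{ij}(p+1) &= w_{ij}(p) + \Delta w_{ij}(p) \\[6pt]
\text{Thresholds:}\quad
\Delta \theta(p) &= \alpha \cdot (-1) \cdot \delta(p)
\end{align}
```

The factor y(1 - y) is the derivative of the logistic sigmoid, which is why it appears in both error gradients.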
Step 4: Iteration; increase iteration p by one, go back to Step 2 and repeat
the process until the selected error criterion is satisfied.
A Step by Step Backpropagation Example: Exclusive-OR.
The initial weights and threshold levels are set randomly as follows:
w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = -1.2, w45 = 1.1,
θ3 = 0.8, θ4 = -0.1 and θ5 = 0.3.
We consider a training set where inputs x1 and x2 are equal to 1 and
desired output yd,5 is 0.
The actual outputs of neurons 3 and 4 in the hidden layer are calculated as
Now the actual output of neuron 5 in the output layer is determined as:
Thus, the following error is obtained:
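Worked out from the initial values above, and assuming the logistic sigmoid with the threshold subtracted from the weighted sum, the forward pass gives (values rounded to four decimals):

```latex
\begin{align}
y_3 &= \operatorname{sigmoid}(x_1 w_{13} + x_2 w_{23} - \theta_3)
     = \frac{1}{1 + e^{-(1 \cdot 0.5 + 1 \cdot 0.4 - 1 \cdot 0.8)}} = 0.5250 \\
y_4 &= \operatorname{sigmoid}(x_1 w_{14} + x_2 w_{24} - \theta_4)
     = \frac{1}{1 + e^{-(1 \cdot 0.9 + 1 \cdot 1.0 + 1 \cdot 0.1)}} = 0.8808 \\
y_5 &= \operatorname{sigmoid}(y_3 w_{35} + y_4 w_{45} - \theta_5)
     = \frac{1}{1 + e^{-(-0.5250 \cdot 1.2 + 0.8808 \cdot 1.1 - 1 \cdot 0.3)}} = 0.5097 \\
e &= y_{d,5} - y_5 = 0 - 0.5097 = -0.5097
\end{align}
```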
The next step is weight training. To update the weights and threshold
levels in our network, we propagate the error, e, from the output layer
backward to the input layer.
First, we calculate the error gradient for neuron 5 in the output layer:
Then we determine the weight corrections, assuming that the learning rate
parameter, α, is equal to 0.1:
Next we calculate the error gradients for neurons 3 and 4 in the hidden
layer:
We then determine the weight corrections:
Finally, we update all weights and threshold levels:
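The whole first iteration can be checked with a short script. This is a sketch under stated assumptions: logistic sigmoid neurons, thresholds treated as weights on a fixed input of -1, and hidden-layer gradients computed with the output weights as they were before the update. The dictionary layout and variable names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Initial weights and threshold levels from the example
# (neurons 1 and 2: inputs, 3 and 4: hidden, 5: output).
w = {"w13": 0.5, "w14": 0.9, "w23": 0.4, "w24": 1.0, "w35": -1.2, "w45": 1.1}
theta = {3: 0.8, 4: -0.1, 5: 0.3}
x1, x2, yd5 = 1.0, 1.0, 0.0   # training pattern: inputs and desired output
alpha = 0.1                    # learning rate

# Forward pass.
y3 = sigmoid(x1 * w["w13"] + x2 * w["w23"] - theta[3])
y4 = sigmoid(x1 * w["w14"] + x2 * w["w24"] - theta[4])
y5 = sigmoid(y3 * w["w35"] + y4 * w["w45"] - theta[5])
e = yd5 - y5

# Error gradients (the hidden ones use the old w35 and w45).
delta5 = y5 * (1.0 - y5) * e
delta3 = y3 * (1.0 - y3) * delta5 * w["w35"]
delta4 = y4 * (1.0 - y4) * delta5 * w["w45"]

# Weight corrections: alpha * (input to the weight) * (gradient of target neuron).
dw = {
    "w13": alpha * x1 * delta3, "w23": alpha * x2 * delta3,
    "w14": alpha * x1 * delta4, "w24": alpha * x2 * delta4,
    "w35": alpha * y3 * delta5, "w45": alpha * y4 * delta5,
}
# Threshold corrections: the threshold acts as a weight on a fixed input of -1.
dtheta = {3: alpha * -1.0 * delta3, 4: alpha * -1.0 * delta4, 5: alpha * -1.0 * delta5}

new_w = {name: w[name] + dw[name] for name in w}
new_theta = {n: theta[n] + dtheta[n] for n in theta}

print(f"y3={y3:.4f} y4={y4:.4f} y5={y5:.4f} e={e:.4f}")
print(f"delta5={delta5:.4f} delta3={delta3:.4f} delta4={delta4:.4f}")
for name in sorted(new_w):
    print(f"{name} = {new_w[name]:.4f}")
for n in sorted(new_theta):
    print(f"theta{n} = {new_theta[n]:.4f}")
```

Rounded to four decimals, this gives error gradients of -0.1274 for neuron 5, 0.0381 for neuron 3 and -0.0147 for neuron 4, and updated values w13 = 0.5038, w14 = 0.8985, w23 = 0.4038, w24 = 0.9985, w35 = -1.2067, w45 = 1.0888, θ3 = 0.7962, θ4 = -0.0985 and θ5 = 0.3127.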
The training process is repeated until the sum of squared errors is less
than 0.001.
Learning curve for operation Exclusive-OR
Final results of network learning:
Network represented by McCulloch-Pitts model for solving the Exclusive-OR
operation.
Decision boundaries
(a) Decision boundary constructed by hidden neuron 3;
(b) Decision boundary constructed by hidden neuron 4;
(c) Decision boundaries constructed by complete network
Problems with Backpropagation
1) “Local minima”: This occurs because the algorithm always changes the
weights in such a way as to cause the error to decrease. But the error
might briefly have to increase before it can fall further, as shown in the
figure. If this is the case, the algorithm “gets stuck” and the error will
not decrease further.
o Solutions to this problem:
Reset the weights to different random numbers and train again.
Add “momentum” to the weight correction: the weight correction
depends not just on the current error, but also on the previous change.
For example:
W = W + current change + (change on previous iteration × constant)
The constant is the momentum constant β, with 0 ≤ β < 1; typically β = 0.95.
This equation is called the generalized delta rule.
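A minimal sketch of the momentum update for a single weight, assuming the usual form of the generalized delta rule (new correction = β × previous correction + α × input × error gradient). The function name and the numeric inputs in the loop are hypothetical, chosen only for illustration:

```python
def momentum_correction(prev_correction, y_j, delta_k, alpha=0.1, beta=0.95):
    """Weight correction with momentum.

    prev_correction: correction applied on the previous iteration
    y_j:             signal feeding the weight
    delta_k:         error gradient of the neuron the weight leads to
    """
    return beta * prev_correction + alpha * y_j * delta_k

# Apply two consecutive corrections to one weight (hypothetical values).
w = 1.1
dw_prev = 0.0
for y_j, delta_k in [(0.88, -0.13), (0.88, -0.12)]:
    dw = momentum_correction(dw_prev, y_j, delta_k)
    w += dw
    dw_prev = dw  # remembered for the next iteration
```

With β = 0 the expression reduces to the plain correction α × input × error gradient, so momentum only adds a fraction of the previous step to the current one.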
Learning with momentum for operation Exclusive-OR: 126 Epochs
2) Biological neurons do not work backward to adjust their synaptic weights,
so the backpropagation algorithm cannot emulate brain-like learning.
3) In the backpropagation algorithm, calculations are extensive (training is
slow).
o Solutions to this problem:
Use the sigmoidal activation function (hyperbolic tangent).
where a and b are constants.
Include a momentum term.
Adjust the learning rate parameter during training (not constant).
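A sketch of the hyperbolic-tangent form of the sigmoidal activation function with constants a and b. The expression 2a / (1 + exp(-bX)) - a and the default values a = 1.716 and b = 0.667 are assumptions based on the form commonly quoted for this function:

```python
import math

def tanh_activation(x, a=1.716, b=0.667):
    """Sigmoidal activation in hyperbolic-tangent form, ranging over (-a, a).

    The constants a and b are assumed, commonly quoted values.
    """
    return 2.0 * a / (1.0 + math.exp(-b * x)) - a

# The same function expressed through math.tanh: a * tanh(b * x / 2).
def tanh_activation_alt(x, a=1.716, b=0.667):
    return a * math.tanh(b * x / 2.0)

for x in (-2.0, 0.0, 2.0):
    print(x, tanh_activation(x), tanh_activation_alt(x))
```

Unlike the logistic sigmoid, this function is symmetric about zero, so neuron outputs can be negative as well as positive.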
Learning with adaptive learning rate: 103 Epochs.
Learning with momentum and adaptive learning rate: 85 Epochs.
Many variations on the standard Backpropagation Algorithm have been
developed over the years to overcome such problems.