CS-402 Parallel and Distributed Systems
Spring 2025
Lecture No. 12
Artificial Neural Networks
What is an Artificial Neural Network (ANN)?
An Artificial Neural Network (ANN) is a computational model inspired by the way biological neural
networks in the human brain process information. ANNs are designed to recognize patterns and
interpret data through a series of layers that simulate how neurons in the brain communicate with
each other.
Here’s a detailed breakdown:
Key Components
Neurons (Nodes): Basic units that receive input, process it, and pass it on. Similar to
biological neurons, they apply an activation function to the input.
Layers:
Input Layer: Receives the initial data.
Hidden Layers: Intermediate layers where data is transformed and processed.
Output Layer: Produces the final result or prediction.
How does an ANN work?
Input Data: Data is fed into the input layer.
Weight and Bias: Each connection between neurons has a weight and each neuron has a bias, both
of which are adjusted during training.
Activation Function: Each neuron applies an activation function (like ReLU, Sigmoid, or Tanh) to its
weighted input to produce the output that is passed to the next layer.
Forward Propagation: Data moves through the network from input to output, being processed and
transformed along the way.
Output: The output layer provides the final prediction or classification result.
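As a minimal illustration of the steps above, the following sketch (not from the course code) pushes a
3-element input through a single sigmoid neuron; the weights, bias, and input values are made up for
the example.

    #include <cmath>
    #include <cstdio>

    // Sigmoid activation: squashes any real number into (0, 1).
    double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    int main() {
        // Hypothetical weights, bias, and input (for illustration only).
        double w[3] = {0.5, -0.2, 0.1};
        double b = 0.3;
        double x[3] = {1.0, 2.0, 3.0};

        // Weighted sum: s = w1*x1 + ... + wm*xm + b
        double s = b;
        for (int i = 0; i < 3; i++) s += w[i] * x[i];

        // The activation function turns the weighted sum into the neuron's output.
        double out = sigmoid(s);
        std::printf("weighted sum = %f, output = %f\n", s, out);
        return 0;
    }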
Training an ANN
Backpropagation: A technique used to train ANNs by adjusting weights and biases to
minimize the difference between the predicted output and the actual output (known as the
loss function).
Epochs: One full cycle through the training dataset.
Learning Rate: A parameter that controls how much the weights and biases are adjusted
during training.
Applications
Image and Speech Recognition: Used in facial recognition systems, voice-activated
assistants, and more.
Natural Language Processing (NLP): Powers applications like language translation and
sentiment analysis.
Medical Diagnosis: Assists in predicting diseases and analyzing medical images.
Financial Forecasting: Helps in stock market predictions and risk management.
Example
Imagine training an ANN to recognize handwritten digits. The input layer receives pixel
values of images, hidden layers process these values, and the output layer predicts the digit
(0-9).
ANNs are incredibly powerful and versatile tools for solving a wide range of problems where
traditional algorithms might struggle.
What do (deep) neural networks do?
Learning (highly) non-linear functions.
x1  x2  x1 ⨁ x2
0   0   0
0   1   1
1   0   1
1   1   0
[Figure: the four input points (0, 0), (0, 1), (1, 0), (1, 1) plotted on the plane]
Logic XOR (⨁) operation
Artificial neural network example
[Figure: a network with an input layer, hidden layers 1-3, and an output layer]
A neural network consists of layers of artificial neurons and connections between them.
Each connection is associated with a weight.
Training a neural network means finding the right weights (and biases) such that the error
across the training data is minimized.
Training a neural network
A neural network is trained with m training samples
(x^(1), y^(1)), (x^(2), y^(2)), ..., (x^(m), y^(m))
x^(i) is an input vector, y^(i) is an output vector
Training objective: minimize the prediction error (loss)
E = Σ_i (y^(i) − o(x^(i)))^2
o(x^(i)) is the predicted output vector for the input vector x^(i)
Approach: Gradient descent (stochastic gradient descent, batch gradient descent, mini-
batch gradient descent).
o Use error to adjust the weight value to reduce the loss. The adjustment amount is proportional to the
contribution of each weight to the loss – Given an error, adjust the weight a little to reduce the error.
Stochastic gradient descent
Given one training sample (x^(i), y^(i))
Compute the output of the neural network o(x^(i))
Training objective: minimize the prediction error (loss) – there are different ways to
define error. The following is an example:
E = 1/2 (y^(i) − o(x^(i)))^2
Estimate how much each weight w contributes to the error: ∂E/∂w
Update the weight by w = w − α ∂E/∂w. Here α is the learning rate.
Algorithm for learning artificial neural network
Initialize the weights W = [w_1, w_2, ..., w_n]
Training (see the sketch below)
o For each training sample (x^(i), y^(i)), use forward propagation to compute the neural
network output vector o(x^(i))
o Compute the error E (various definitions)
o Use backward propagation to compute ∂E/∂w for each weight w
o Update w = w − α ∂E/∂w
o Repeat until E is sufficiently small.
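To make the training loop concrete, here is a minimal sketch of the same cycle on the simplest
possible model: a single weight w fitted to y = 2x with squared loss. The data, learning rate, and
stopping threshold are assumptions; a real network repeats the same update for every weight and bias.

    #include <cstdio>

    int main() {
        // Hypothetical 1-weight model o(x) = w*x trained to fit y = 2x.
        double xs[4] = {1, 2, 3, 4};
        double ys[4] = {2, 4, 6, 8};
        double w = 0.0;            // initial weight
        double alpha = 0.01;       // learning rate (assumed)

        for (int epoch = 0; epoch < 200; epoch++) {   // repeat until E is small
            double E = 0.0;
            for (int i = 0; i < 4; i++) {
                double o = w * xs[i];                 // forward propagation
                double err = o - ys[i];
                E += 0.5 * err * err;                 // E = 1/2 (o - y)^2, summed
                double dE_dw = err * xs[i];           // backward step: dE/dw
                w -= alpha * dE_dw;                   // w = w - alpha * dE/dw
            }
            if (E < 1e-6) break;                      // stop when the error is tiny
        }
        std::printf("learned w = %f (target 2.0)\n", w);
        return 0;
    }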
A single neuron
[Figure: a single neuron with inputs x_1, ..., x_m, weights w_1, ..., w_m, bias b, a weighted-sum unit ∑, and an activation function]
s = w_1*x_1 + ... + w_m*x_m + b
An artificial neuron has two components: (1) a weighted sum and (2) an activation
function.
o Many activation functions: Sigmoid, ReLU, etc.
Sigmoid function
•Shape: S-shaped curve.
•Range: Outputs values between 0 and 1.
•Usage: Often used in binary classification problems.
•Behavior:
• Small input values (negative) result in outputs close to 0.
• Large input values (positive) result in outputs close to 1.
• Middle input values result in outputs around 0.5.
Example: Imagine a light dimmer switch that smoothly transitions from off (0) to fully on (1).
Sigmoid function
σ(x) = 1 / (1 + e^(−x))
The derivative of the sigmoid function:
σ'(x) = σ(x) (1 − σ(x))
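A small sketch of the sigmoid and the derivative identity above (the test inputs are arbitrary):

    #include <cmath>
    #include <cstdio>

    // sigma(x) = 1 / (1 + e^(-x))
    double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    // Derivative using the identity sigma'(x) = sigma(x) * (1 - sigma(x)).
    double sigmoid_prime(double x) {
        double s = sigmoid(x);
        return s * (1.0 - s);
    }

    int main() {
        double xs[3] = {-2.0, 0.0, 2.0};
        for (double x : xs) {
            std::printf("x=%5.1f  sigma=%.4f  sigma'=%.4f\n",
                        x, sigmoid(x), sigmoid_prime(x));
        }
        return 0;
    }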
ReLU (Rectified Linear Unit)
•Shape: Linear for positive values, flat for negative values.
•Range: Outputs values between 0 and infinity.
•Usage: Commonly used in hidden layers of neural networks.
•Behavior:
• Negative input values result in outputs of 0.
• Positive input values result in outputs equal to the input value.
Example: Think of a water tap that only lets water flow when turned on (positive
values), but stops completely when turned off (negative values).
Tanh (Hyperbolic Tangent)
•Shape: S-shaped curve, similar to Sigmoid but steeper.
•Range: Outputs values between -1 and 1.
•Usage: Often used in hidden layers of neural networks.
•Behavior:
•Small input values (negative) result in outputs close to -1.
•Large input values (positive) result in outputs close to 1.
•Middle input values result in outputs around 0.
•Example: Imagine a seesaw that smoothly transitions from one side (-1) to the other
side (1).
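For comparison with the sigmoid sketch, minimal ReLU and tanh implementations (std::tanh comes
from <cmath>; the sample inputs are arbitrary):

    #include <algorithm>
    #include <cmath>
    #include <cstdio>

    // ReLU: 0 for negative inputs, identity for positive inputs.
    double relu(double x) { return std::max(0.0, x); }

    int main() {
        double xs[5] = {-2.0, -0.5, 0.0, 0.5, 2.0};
        for (double x : xs) {
            // relu outputs values in [0, infinity); std::tanh outputs values in (-1, 1).
            std::printf("x=%5.1f  relu=%5.2f  tanh=%6.3f\n", x, relu(x), std::tanh(x));
        }
        return 0;
    }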
Comparison
•Sigmoid: Good for outputs that need to be between 0 and 1, but can suffer from
vanishing gradients (slow learning).
•ReLU: Efficient and widely used, but can suffer from "dead neurons" (outputs stuck at
0).
•Tanh: Similar to Sigmoid but outputs range from -1 to 1, providing stronger gradients
for learning.
Training for the logic AND with a single neuron
In general, one neuron can be trained to realize a linear (linearly separable) function.
The logic AND function is linearly separable:
x1  x2  x1 ⋀ x2
0   0   0
0   1   0
1   0   0
1   1   1
[Figure: the four input points (0, 0), (0, 1), (1, 0), (1, 1) plotted on the plane]
Logic AND (⋀) operation
Training for the logic AND with a single neuron
[Figure: a single neuron with inputs x1 = 0 and x2 = 1, weights w1 = 0 and w2 = 0, bias b = 0, weighted sum ∑, and Sigmoid activation]
s = w1*x1 + w2*x2 + b = 0
O = Sigmoid(s) = Sigmoid(0) = 0.5
Consider training data input (x1 = 0, x2 = 1), output Y = 0.
NN output O = 0.5
Error: E = 1/2 (Y − O)^2 = 0.125
To update w1, w2, and b, gradient descent needs to compute ∂E/∂w1, ∂E/∂w2, and ∂E/∂b
Chain rule for calculating ∂E/∂w1, ∂E/∂w2, and ∂E/∂b
[Figure: the same neuron, with w1 = 0, w2 = 0, b = 0, s = w1*x1 + w2*x2 + b = 0, and O = Sigmoid(0) = 0.5]
If a variable z depends on the variable y, which itself depends on the
variable x, then z depends on x as well, via the intermediate variable y.
The chain rule is a formula that expresses the derivative as:
dz/dx = (dz/dy) * (dy/dx)
Training for the logic AND with a single neuron
[Figure: the same neuron, with w1 = 0, w2 = 0, b = 0, s = 0, and O = Sigmoid(0) = 0.5]
s = w1*x1 + w2*x2 + b = 0
∂E/∂w1 = (∂E/∂O) * (∂O/∂s) * (∂s/∂w1)
∂E/∂O = ∂(1/2 (Y − O)^2)/∂O = O − Y = 0.5 − 0 = 0.5
∂O/∂s = ∂Sigmoid(s)/∂s = Sigmoid(s) (1 − Sigmoid(s)) = 0.5 (1 − 0.5) = 0.25
∂s/∂w1 = ∂(w1*x1 + w2*x2 + b)/∂w1 = x1 = 0
To update w1: w1 = w1 − α * (∂E/∂O) * (∂O/∂s) * (∂s/∂w1) = 0 − 0.1*0.5*0.25*0 = 0
Assume learning rate α = 0.1
Training for the logic AND with a single neuron
[Figure: the same neuron, with w1 = 0, w2 = 0, b = 0, s = 0, and O = Sigmoid(0) = 0.5]
s = w1*x1 + w2*x2 + b = 0
∂E/∂O = ∂(1/2 (Y − O)^2)/∂O = O − Y = 0.5 − 0 = 0.5
∂O/∂s = ∂Sigmoid(s)/∂s = Sigmoid(s) (1 − Sigmoid(s)) = 0.5 (1 − 0.5) = 0.25
∂s/∂w2 = ∂(w1*x1 + w2*x2 + b)/∂w2 = x2 = 1
To update w2: w2 = w2 − α * (∂E/∂O) * (∂O/∂s) * (∂s/∂w2) = 0 − 0.1*0.5*0.25*1 = −0.0125
Training for the logic AND with a single neuron
[Figure: the same neuron, with w1 = 0, w2 = 0, b = 0, s = 0, and O = Sigmoid(0) = 0.5]
s = w1*x1 + w2*x2 + b = 0
∂E/∂O = ∂(1/2 (Y − O)^2)/∂O = O − Y = 0.5 − 0 = 0.5
∂O/∂s = ∂Sigmoid(s)/∂s = Sigmoid(s) (1 − Sigmoid(s)) = 0.5 (1 − 0.5) = 0.25
∂s/∂b = ∂(w1*x1 + w2*x2 + b)/∂b = 1
To update b: b = b − α * (∂E/∂O) * (∂O/∂s) * (∂s/∂b) = 0 − 0.1*0.5*0.25*1 = −0.0125
Training for the logic AND with a single neuron
[Figure: the neuron after one update, with w1 = 0, w2 = −0.0125, and b = −0.0125]
Updated parameters: w1 = 0, w2 = −0.0125, b = −0.0125
This process is repeated until the error is sufficiently small.
The initial weights should be randomized. Gradient descent can get stuck in a local
optimum.
See lect7/one.cpp for training the logic AND operation with a single neuron.
Note: the logic XOR operation is not linearly separable and cannot be trained with one
neuron.
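The following is not lect7/one.cpp, only a minimal sketch in the same spirit: it repeats the w1/w2/b
updates derived above over the four AND samples. The learning rate and iteration count are
assumptions; starting from all-zero parameters is fine here because there is only one neuron.

    #include <cmath>
    #include <cstdio>

    double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    int main() {
        // The four training samples of the logic AND function.
        double X[4][2] = {{0,0}, {0,1}, {1,0}, {1,1}};
        double Y[4]    = { 0,     0,     0,     1  };

        double w1 = 0.0, w2 = 0.0, b = 0.0;   // the slides start from all zeros
        double alpha = 0.1;                   // learning rate (assumed)

        for (int iter = 0; iter < 100000; iter++) {
            for (int i = 0; i < 4; i++) {
                double s = w1 * X[i][0] + w2 * X[i][1] + b;  // weighted sum
                double O = sigmoid(s);                       // neuron output
                // Chain rule: dE/dw = (O - Y) * O * (1 - O) * x
                double dE_ds = (O - Y[i]) * O * (1.0 - O);
                w1 -= alpha * dE_ds * X[i][0];
                w2 -= alpha * dE_ds * X[i][1];
                b  -= alpha * dE_ds;
            }
        }
        for (int i = 0; i < 4; i++) {
            double O = sigmoid(w1 * X[i][0] + w2 * X[i][1] + b);
            std::printf("%g AND %g -> %.3f (target %g)\n", X[i][0], X[i][1], O, Y[i]);
        }
        return 0;
    }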
Multi-level feedforward neural networks
A multi-level feedforward neural network is a neural network that
consists of multiple levels of neurons. Each level can have many neurons
and connections between neurons in different levels do not form loops.
o Information moves in one direction (forward) from input nodes, through hidden
nodes, to output nodes.
One artificial neuron can only realize a linear function.
Many layers of neurons combine linear functions and can be trained to approximate
arbitrarily complex functions.
o One hidden layer (with enough neurons) can be trained to approximate any continuous
function.
Multi-level feedforward neural networks examples
A layer of neurons that is neither the input layer nor the output layer is called a
hidden layer.
[Figure: a network with an input layer, hidden layers 1-3, and an output layer]
Build a 3-level neural network from scratch
3 levels: input level, hidden level, output level
o Other assumptions: fully connected between layers, all neurons use the sigmoid
function σ as the activation function.
Notations:
o N0: size of the input layer. Input: IN[N0] = [IN_1, IN_2, ..., IN_N0]
o N1: size of the hidden layer
o N2: size of the output layer. Output: OO[N2] = [OO_1, OO_2, ..., OO_N2]
Build a 3-level neural network from scratch
Notations:
o N0, N1, N2: sizes of the input layer, hidden layer, and output layer, respectively
o W0[N0][N1]: weights from the input layer to the hidden layer. W0[i][j] is the weight from
input unit i to hidden unit j. Hidden layer biases B1[N1] = [B1_1, B1_2, ..., B1_N1]
o W1[N1][N2]: weights from the hidden layer to the output layer. W1[i][j] is the weight from
hidden unit i to output unit j. Output layer biases B2[N2] = [B2_1, B2_2, ..., B2_N2]
o W0 is an N0×N1 matrix and W1 is an N1×N2 matrix:
W0 = [W0[i][j]], i = 1..N0, j = 1..N1;  W1 = [W1[i][j]], i = 1..N1, j = 1..N2
3-level feedforward neural network
[Figure: 3-level feedforward neural network]
Input layer (units 1..N0): input IN[N0]
Hidden layer (units 1..N1): weights W0[N0][N1], biases B1[N1], weighted sums HS[N1],
outputs HO[N1]
Output layer (units 1..N2): weights W1[N1][N2], biases B2[N2], weighted sums OS[N2],
outputs OO[N2]
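One possible set of C++ declarations matching the arrays in this figure (the layer sizes 2/4/1 and the
initialization range are assumptions, not fixed by the slides):

    #include <cstdlib>

    // Example layer sizes (assumed for illustration; any sizes work).
    const int N0 = 2;       // input layer size
    const int N1 = 4;       // hidden layer size
    const int N2 = 1;       // output layer size

    double IN[N0];          // input vector
    double W0[N0][N1];      // weights, input layer -> hidden layer
    double B1[N1];          // hidden layer biases
    double HS[N1];          // hidden layer weighted sums
    double HO[N1];          // hidden layer outputs
    double W1[N1][N2];      // weights, hidden layer -> output layer
    double B2[N2];          // output layer biases
    double OS[N2];          // output layer weighted sums
    double OO[N2];          // output layer outputs

    // Random initial weights in [-1, 1], as recommended earlier; biases start at 0 here.
    void init_weights() {
        for (int i = 0; i < N0; i++)
            for (int j = 0; j < N1; j++)
                W0[i][j] = 2.0 * std::rand() / RAND_MAX - 1.0;
        for (int i = 0; i < N1; i++)
            for (int j = 0; j < N2; j++)
                W1[i][j] = 2.0 * std::rand() / RAND_MAX - 1.0;
        for (int j = 0; j < N1; j++) B1[j] = 0.0;
        for (int j = 0; j < N2; j++) B2[j] = 0.0;
    }

    int main() { init_weights(); return 0; }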
Forward propagation (compute OO and E)
Compute the hidden layer weighted sums: HS[N1] = [HS_1, HS_2, ..., HS_N1]
o HS_j = IN_1 × W0[1][j] + IN_2 × W0[2][j] + ... + IN_N0 × W0[N0][j] + B1_j
o In matrix form: HS = IN × W0 + B1
Compute the hidden layer outputs: HO[N1] = [HO_1, HO_2, ..., HO_N1]
o HO_j = σ(HS_j)
o In matrix form: HO = σ(HS)
Forward propagation
From input (IN[N0]), compute output (OO[N2]) and error E.
Compute the output layer weighted sums: OS[N2] = [OS_1, OS_2, ..., OS_N2]
o OS_j = HO_1 × W1[1][j] + HO_2 × W1[2][j] + ... + HO_N1 × W1[N1][j] + B2_j
o In matrix form: OS = HO × W1 + B2
Compute the final output: OO[N2] = [OO_1, OO_2, ..., OO_N2]
o OO_j = σ(OS_j)
o In matrix form: OO = σ(OS)
Let us use the mean square error: E = 1/2 Σ_j (Y_j − OO_j)^2
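A self-contained sketch of the two forward-propagation steps and the error (the layer sizes, the sample
input, and the target are assumptions; the declarations repeat the previous sketch so this compiles on
its own):

    #include <cmath>
    #include <cstdio>

    const int N0 = 2, N1 = 4, N2 = 1;                  // assumed layer sizes

    double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    double IN[N0], W0[N0][N1], B1[N1], HS[N1], HO[N1];
    double W1[N1][N2], B2[N2], OS[N2], OO[N2], Y[N2];

    int main() {
        IN[0] = 1.0; IN[1] = 0.0;                      // example input (assumed)
        Y[0] = 1.0;                                    // example target (assumed)
        // Weights and biases are left at 0 here; a real program initializes them randomly.

        // Hidden layer: HS_j = sum_i IN_i * W0[i][j] + B1_j,  HO_j = sigmoid(HS_j)
        for (int j = 0; j < N1; j++) {
            HS[j] = B1[j];
            for (int i = 0; i < N0; i++) HS[j] += IN[i] * W0[i][j];
            HO[j] = sigmoid(HS[j]);
        }

        // Output layer: OS_j = sum_i HO_i * W1[i][j] + B2_j,  OO_j = sigmoid(OS_j)
        for (int j = 0; j < N2; j++) {
            OS[j] = B2[j];
            for (int i = 0; i < N1; i++) OS[j] += HO[i] * W1[i][j];
            OO[j] = sigmoid(OS[j]);
        }

        // Error: E = 1/2 * sum_j (Y_j - OO_j)^2
        double E = 0.0;
        for (int j = 0; j < N2; j++) E += 0.5 * (Y[j] - OO[j]) * (Y[j] - OO[j]);
        std::printf("OO[0] = %f, E = %f\n", OO[0], E);
        return 0;
    }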
Backward propagation
The goal is to compute ∂E/∂W0, ∂E/∂B1, ∂E/∂W1, and ∂E/∂B2.
∂E/∂OO = [∂E/∂OO_1, ∂E/∂OO_2, ..., ∂E/∂OO_N2] = [(OO_1 − Y_1), (OO_2 − Y_2), ..., (OO_N2 − Y_N2)]
In matrix form: ∂E/∂OO = (OO − Y)
This can be stored in an array dE_OO[N2].
Backward propagation
The goal is to compute ∂E/∂W0, ∂E/∂B1, ∂E/∂W1, and ∂E/∂B2.
∂E/∂OO is done
∂E/∂OS = [∂E/∂OS_1, ..., ∂E/∂OS_N2]
= [∂E/∂OO_1 × σ(OS_1)(1 − σ(OS_1)), ..., ∂E/∂OO_N2 × σ(OS_N2)(1 − σ(OS_N2))]
In matrix form: ∂E/∂OS = (OO − Y) ⊙ OO ⊙ (1 − OO)
This can be stored in an array dE_OS[N2].
Backward propagation
The goal is to compute ∂E/∂W0, ∂E/∂B1, ∂E/∂W1, and ∂E/∂B2.
∂E/∂OO, ∂E/∂OS are done
OS_j = HO_1 × W1[1][j] + HO_2 × W1[2][j] + ... + HO_N1 × W1[N1][j] + B2_j
Hence, ∂OS_j/∂B2_j = 1.
∂E/∂B2 = [∂E/∂B2_1, ..., ∂E/∂B2_N2] = [∂E/∂OS_1, ..., ∂E/∂OS_N2] = ∂E/∂OS
Backward propagation
The goal is to compute ∂E/∂W0, ∂E/∂B1, ∂E/∂W1, and ∂E/∂B2.
∂E/∂OO, ∂E/∂OS, ∂E/∂B2 are done
∂E/∂W1 is an N1×N2 matrix whose (i, j) entry is ∂E/∂W1[i][j].
OS_j = HO_1 × W1[1][j] + HO_2 × W1[2][j] + ... + HO_N1 × W1[N1][j] + B2_j
Hence, ∂OS_j/∂W1[i][j] = HO_i, and ∂E/∂W1[i][j] = ∂E/∂OS_j × HO_i.
Backward propagation
The goal is to compute ∂E/∂W0, ∂E/∂B1, ∂E/∂W1, and ∂E/∂B2.
∂E/∂OO, ∂E/∂OS, ∂E/∂B2 are done
∂E/∂W1 = [∂E/∂W1[i][j]] = [HO_i × ∂E/∂OS_j] (an N1×N2 matrix)
In matrix form: ∂E/∂W1 = HO^T × ∂E/∂OS
(treating HO as a 1×N1 row vector and ∂E/∂OS as a 1×N2 row vector)
Backward propagation
The goal is to compute ∂E/∂W0, ∂E/∂B1, ∂E/∂W1, and ∂E/∂B2.
∂E/∂OO, ∂E/∂OS, ∂E/∂B2, ∂E/∂W1 are done
∂E/∂HO = [∂E/∂HO_1, ∂E/∂HO_2, ..., ∂E/∂HO_N1]
∂E/∂HO_i = ∂E/∂OS_1 × ∂OS_1/∂HO_i + ∂E/∂OS_2 × ∂OS_2/∂HO_i + ... + ∂E/∂OS_N2 × ∂OS_N2/∂HO_i
= ∂E/∂OS_1 × W1[i][1] + ∂E/∂OS_2 × W1[i][2] + ... + ∂E/∂OS_N2 × W1[i][N2]
Backward propagation
The goal is to compute ∂E/∂W0, ∂E/∂B1, ∂E/∂W1, and ∂E/∂B2.
∂E/∂OO, ∂E/∂OS, ∂E/∂B2, ∂E/∂W1 are done
In matrix form: ∂E/∂HO = [∂E/∂HO_1, ∂E/∂HO_2, ..., ∂E/∂HO_N1] = ∂E/∂OS × W1^T
Backward propagation
The goal is to compute ∂E/∂W0, ∂E/∂B1, ∂E/∂W1, and ∂E/∂B2.
∂E/∂OO, ∂E/∂OS, ∂E/∂B2, ∂E/∂W1, ∂E/∂HO are done
Once ∂E/∂HO is computed, we can repeat the process for the hidden
layer by replacing OO with HO, OS with HS, B2 with B1, and W1 with
W0 in the differentiation. For the hidden layer, the input is IN[N0] and the
output is HO[N1].
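Pulling the derivatives together, here is one self-contained sketch of a forward pass followed by a
single backward-propagation update, including the hidden-layer repetition described above. The layer
sizes, sample input, and learning rate are assumptions, and this is not the actual 3level.cpp.

    #include <cmath>
    #include <cstdio>

    const int N0 = 2, N1 = 4, N2 = 1;                  // assumed layer sizes
    const double alpha = 0.5;                          // assumed learning rate

    double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    double IN[N0], W0[N0][N1], B1[N1], HS[N1], HO[N1];
    double W1[N1][N2], B2[N2], OS[N2], OO[N2], Y[N2];
    double dE_OO[N2], dE_OS[N2], dE_HO[N1], dE_HS[N1];

    void forward() {
        for (int j = 0; j < N1; j++) {                 // HS = IN x W0 + B1, HO = sigmoid(HS)
            HS[j] = B1[j];
            for (int i = 0; i < N0; i++) HS[j] += IN[i] * W0[i][j];
            HO[j] = sigmoid(HS[j]);
        }
        for (int j = 0; j < N2; j++) {                 // OS = HO x W1 + B2, OO = sigmoid(OS)
            OS[j] = B2[j];
            for (int i = 0; i < N1; i++) OS[j] += HO[i] * W1[i][j];
            OO[j] = sigmoid(OS[j]);
        }
    }

    void backward() {
        // Output layer: dE/dOO = OO - Y, dE/dOS = dE/dOO * OO * (1 - OO)
        for (int j = 0; j < N2; j++) {
            dE_OO[j] = OO[j] - Y[j];
            dE_OS[j] = dE_OO[j] * OO[j] * (1.0 - OO[j]);
        }
        // dE/dHO_i = sum_j dE/dOS_j * W1[i][j]  (computed before W1 is updated)
        for (int i = 0; i < N1; i++) {
            dE_HO[i] = 0.0;
            for (int j = 0; j < N2; j++) dE_HO[i] += dE_OS[j] * W1[i][j];
        }
        // Update output-layer parameters: dE/dW1[i][j] = HO_i * dE/dOS_j, dE/dB2_j = dE/dOS_j
        for (int j = 0; j < N2; j++) {
            for (int i = 0; i < N1; i++) W1[i][j] -= alpha * HO[i] * dE_OS[j];
            B2[j] -= alpha * dE_OS[j];
        }
        // Hidden layer: same formulas with OO->HO, OS->HS, W1->W0, B2->B1, and input IN
        for (int j = 0; j < N1; j++) {
            dE_HS[j] = dE_HO[j] * HO[j] * (1.0 - HO[j]);
            for (int i = 0; i < N0; i++) W0[i][j] -= alpha * IN[i] * dE_HS[j];
            B1[j] -= alpha * dE_HS[j];
        }
    }

    int main() {
        IN[0] = 1.0; IN[1] = 1.0; Y[0] = 0.0;          // one made-up training sample
        forward();
        std::printf("error before update: %f\n", 0.5 * (Y[0] - OO[0]) * (Y[0] - OO[0]));
        backward();
        forward();
        std::printf("error after update:  %f\n", 0.5 * (Y[0] - OO[0]) * (Y[0] - OO[0]));
        return 0;
    }

Running the sketch should show the error decrease after the single update, since every parameter
moves against its gradient; a full trainer simply wraps forward() and backward() in the loop from the
earlier algorithm slide.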
Summary
[Figure: input layer IN, hidden layers H1 and H2, and output layer; layers 1, 2, 3 map input X to output Y]
The output of a layer is the input of the next layer.
Backward propagation uses results from forward propagation.
o For example, ∂E/∂OO = OO − Y, ∂OO/∂OS = OO ⊙ (1 − OO), and ∂OS_j/∂W1[i][j] = HO_i all
reuse the values OO and HO computed during forward propagation.
Training for the logic XOR and AND with a 6-unit 2-level neural network
The logic XOR function is not linearly separable.
See 3level.cpp
x1  x2  x1 ⨁ x2
0   0   0
0   1   1
1   0   1
1   1   0
[Figure: the four input points (0, 0), (0, 1), (1, 0), (1, 1) on the plane, with the AND and XOR decision boundaries drawn]
Logic XOR (⨁) operation
Summary
Briefly discussed multi-level feedforward neural networks
The training of neural networks
Following 3level.cpp, one should be able to write a program for any multi-level
feedforward neural network.