An Introduction to Neural Networks
Instituto Tecgraf, PUC-Rio
Name: Fernanda Duarte
Advisor: Marcelo Gattass
What is Machine Learning?
A machine learning algorithm is an algorithm that is able to learn from data.
“A computer program is said to learn from experience E with respect to some class of tasks T
and performance measure P, if its performance at tasks in T, as measured by P, improves
with experience E.” (Mitchell, 1997)
Applications
- Digit recognition
- Face recognition
- Recommendation engines
- Virtual assistants (Cortana, Siri, etc.)
- Self-driving vehicles
- Surveillance systems
Tasks...
Formal tasks: Playing board games, solving puzzles, mathematical and logic
problems → Easier to code!
Expert tasks: Medical diagnosis, engineering, scheduling.
Mundane tasks: Everyday speech, written language, perception, walking, object
recognition and manipulation.
Artificial Neural Networks
Neuron: biological inspiration for computation
[Figure: biological neuron vs. artificial neuron/unit]
Perceptron (Frank Rosenblatt, 1957)
- Algorithm for learning a binary classifier.
- Only capable of learning linearly separable patterns.
output = 0 if w·x + b ≤ 0, or
output = 1 if w·x + b > 0
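A minimal sketch of the perceptron decision rule and the classic perceptron learning rule in Python (the toy AND dataset and the hyperparameters are illustrative assumptions, not from the slides):

```python
import numpy as np

def perceptron_predict(w, b, x):
    # Binary decision: 1 if w.x + b > 0, else 0.
    return 1 if np.dot(w, x) + b > 0 else 0

def perceptron_train(X, y, epochs=10, lr=0.1):
    # Perceptron learning rule: nudge w and b by the prediction error.
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - perceptron_predict(w, b, xi)
            w += lr * err * xi
            b += lr * err
    return w, b

# Linearly separable toy data: the AND function.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])
w, b = perceptron_train(X, y)
print([perceptron_predict(w, b, xi) for xi in X])  # [0, 0, 0, 1]
```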
Feedforward Neural Networks
(or Deep Feedforward Networks or Multilayer
Perceptrons (MLP)) (see “Deep Learning” book, Ian Goodfellow et al.)
- Multilayer structure → “sophisticated decision making” (a unit in the second
layer can make a decision at a more complex and more abstract level than a unit
in the first layer) → Learn features directly from the data.
- Nonlinearity (extends the kinds of functions that we can represent with our
neural network, e.g. the XOR (“exclusive or”) function; see the sketch below)
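A minimal sketch of this point: one hidden layer of two threshold units suffices to compute XOR, which a single perceptron cannot represent. The weights here are hand-picked for illustration, not learned:

```python
import numpy as np

def step(z):
    # Threshold activation, as in the perceptron.
    return (z > 0).astype(int)

def xor_net(x1, x2):
    # Hidden layer: h[0] computes OR, h[1] computes NAND (hand-picked weights).
    x = np.array([x1, x2])
    h = step(np.array([[1, 1], [-1, -1]]) @ x + np.array([-0.5, 1.5]))
    # Output unit: AND of the two hidden units yields XOR.
    return step(np.array([1, 1]) @ h + np.array([-1.5]))[0]

print([xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```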
Feedforward Neural Networks
Goal: Approximate some function f*
Example → A classifier y = f*(x)
input x, category (label) y
A feedforward network defines a mapping y = f(x; θ) and
learns the values of the parameters θ that result in the best
function approximation.
Feedforward: information flows in one direction (input layer → output layer)
Why “network”? - Math Intuition:
- Typically represented by composing together many different functions.
Example: functions f^(1), f^(2), f^(3) connected in a chain, to form
f(x) = f^(3)(f^(2)(f^(1)(x)))
In this case, f^(1) is called the first hidden layer of the network, f^(2) is
called the second hidden layer, and the final layer f^(3) is called the output
layer.
- Why hidden layer? Behavior not directly specified → learning algorithm must
decide how to use those layers to form f(x) that best approximates f*.
- Length of the chain gives the depth of the model → deep learning (you can
“stack” multiple layers)
Graph representation of the network
- The feedforward network model is associated with a directed acyclic graph
(DAG) describing how the functions are composed together.
Artificial neuron
Fully-connected layers
- Neurons between two adjacent layers are fully pairwise connected, but neurons
within a single layer share no connections.
Feedforward computation
- The abstraction of a layer has the nice property that it allows us to use efficient
vectorized code (e.g. matrix multiplies).
- Think of each hidden layer as a vector, where each value represents a
neuron/unit.
- Repeated matrix multiplications interwoven with a nonlinear activation function.
Example of activation function: Sigmoid, σ(z) = 1 / (1 + e^(−z))
Obs.: The output layer neurons most commonly have a different activation function →
e.g. softmax for class scores (classification), linear functions for real-valued
targets (regression), etc.
The weights W and biases b are the learnable parameters
of the network!
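To make the vectorized feedforward computation concrete, here is a minimal NumPy sketch of the forward pass through two sigmoid hidden layers and a softmax output layer; the layer sizes are illustrative assumptions, not from the slides:

```python
import numpy as np

def sigmoid(z):
    # Sigmoid activation: squashes each entry of z into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Softmax for class scores; subtracting the max improves numerical stability.
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)

# Illustrative sizes: 4 inputs, hidden layers of 5 and 3 units, 2 classes.
W1, b1 = rng.standard_normal((5, 4)), np.zeros(5)
W2, b2 = rng.standard_normal((3, 5)), np.zeros(3)
W3, b3 = rng.standard_normal((2, 3)), np.zeros(2)

x = rng.standard_normal(4)      # one input vector
h1 = sigmoid(W1 @ x + b1)       # first hidden layer
h2 = sigmoid(W2 @ h1 + b2)      # second hidden layer
y = softmax(W3 @ h2 + b3)       # output layer: class probabilities
print(y)                        # probabilities summing to 1
```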
What about learning?
Optimization: find the parameters that minimize the cost function (or loss function).
A loss function C is a measure of how wrong the model is in terms of its ability to
estimate the relationship between x and y, i.e., y = f*(x), with the chosen
parameters. (e.g. Mean Squared Error (MSE))
Gradient Descent is a very common optimization algorithm.
Obs.: Training → Training + Validation sets
Inference → Test set
Gradient Descent
The gradient of a function gives the direction of steepest ascent
[Figure: weight connecting neuron i in layer l-1 to neuron j in layer l]
The direction of steepest descent is the negative gradient.
Parameter update during the learning process:
w ← w − η ∂C/∂w,   b ← b − η ∂C/∂b
where η is the learning rate (hyperparameter).
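A minimal numeric sketch of this update rule, minimizing the toy cost C(w) = (w − 3)², an illustrative choice not from the slides:

```python
# Gradient descent on C(w) = (w - 3)**2, whose gradient is dC/dw = 2*(w - 3).
eta = 0.1        # learning rate (hyperparameter)
w = 0.0          # initial parameter value
for _ in range(50):
    grad = 2 * (w - 3)   # gradient of the cost at the current w
    w -= eta * grad      # step in the direction of steepest descent
print(w)  # approaches the minimizer w = 3
```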
Backpropagation
How to compute the gradient of the cost function with respect to the weights w and biases b
(efficiently)?
The backprop algorithm gives us detailed insights into how changing the weights and biases
changes the overall behavior of the network.
Propagate the error backwards.
Main idea: use chain rule to compute the gradients.
Example: consider a weight w feeding a neuron with pre-activation z = w·a_prev + b and
activation a = σ(z). By the chain rule, the gradient of C_i (the cost function of learning
example i) with respect to w is
∂C_i/∂w = (∂C_i/∂a) · (∂a/∂z) · (∂z/∂w)
Applying the same factorization layer by layer, from the output backwards, yields the
gradient with respect to every weight and bias in the network.
Backpropagation
See: http://neuralnetworksanddeeplearning.com/chap2.html#the_backpropagation_algorithm
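A compact sketch of these chain-rule computations for a one-hidden-layer network with sigmoid activations and squared-error cost (the sizes, data, and learning rate are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((3, 2)), np.zeros(3)   # hidden layer: 2 -> 3
W2, b2 = rng.standard_normal((1, 3)), np.zeros(1)   # output layer: 3 -> 1

x, y = np.array([0.5, -1.0]), np.array([1.0])       # one training example

# Feedforward computation, keeping intermediate values for backprop.
z1 = W1 @ x + b1; a1 = sigmoid(z1)
z2 = W2 @ a1 + b2; a2 = sigmoid(z2)
C = 0.5 * np.sum((a2 - y) ** 2)                     # squared-error cost

# Backpropagation: chain rule, from the output layer backwards.
delta2 = (a2 - y) * a2 * (1 - a2)                   # dC/dz2
dW2, db2 = np.outer(delta2, a1), delta2             # dC/dW2, dC/db2
delta1 = (W2.T @ delta2) * a1 * (1 - a1)            # dC/dz1
dW1, db1 = np.outer(delta1, x), delta1              # dC/dW1, dC/db1

# Weight update: one gradient descent step with learning rate eta.
eta = 0.5
W2 -= eta * dW2; b2 -= eta * db2
W1 -= eta * dW1; b1 -= eta * db1
```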
Learning process (summary)
For each learning example i in the training set:
1 - Feedforward computation;
2 - Backpropagation;
3 - Weight update.
- Learning rate (η)
- Regularization (prevents overfitting)
- Epoch
- Activation function alternatives (ReLU, tanh, etc)
- Stochastic Gradient Descent (SGD) → Minibatch (see the sketch below)
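Putting the three steps together with minibatch SGD, the training loop has roughly this structure. This is a sketch under stated assumptions: the `grad` callback (which would run feedforward + backprop, as above) and the toy linear model are illustrative, not from the slides:

```python
import numpy as np

def sgd_train(params, grad, X, Y, epochs=10, batch_size=32, eta=0.1):
    # Minibatch SGD skeleton: grad(params, Xb, Yb) must return the cost
    # gradient averaged over the minibatch (computed by backpropagation).
    n = len(X)
    for _ in range(epochs):
        order = np.random.permutation(n)          # reshuffle every epoch
        for start in range(0, n, batch_size):
            b = order[start:start + batch_size]   # indices of one minibatch
            g = grad(params, X[b], Y[b])          # 1 - feedforward, 2 - backprop
            params = params - eta * g             # 3 - weight update
    return params

# Toy usage: fit a 1-D linear model y = w*x by least squares.
X = np.linspace(0, 1, 100); Y = 2.0 * X
grad = lambda w, xb, yb: np.mean(2 * (w * xb - yb) * xb)
print(sgd_train(0.0, grad, X, Y, epochs=100))  # approaches 2.0
```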
Convolutional Neural Networks (CNN)
[Figure: the AlexNet architecture]
Convolutional Neural Networks (CNN or ConvNets)
- Useful when the proximity between two data points indicates how related they are (e.g.: pixels in
images!) → CNNs preserve spatial structure.
- The neurons in a layer will only be connected to a small region of the layer before it, instead of all
of the neurons in a fully-connected manner → fewer parameters!
- Convolutional Neural Networks take advantage of the fact that the input consists of images and
they constrain the architecture in a more sensible way. In particular, unlike a regular Neural
Network, the layers of a ConvNet have neurons arranged in 3 dimensions: width, height, depth.
Convolutional Neural Networks (CNN or ConvNets)
Every layer of a ConvNet transforms one volume of activations to another through a differentiable
function.
We use three main types of layers to build a basic CNN architecture: Convolutional Layer, Pooling Layer,
and Fully-Connected Layer.
[Figure: Convolution → Pooling → Convolution → Pooling → Fully Connected → Fully Connected → Output predictions]
Convolutional Layer
The CONV layer’s parameters consist of a set of learnable filters → Every filter is small spatially (along
width and height), but extends through the full depth of the input volume (e.g. 5x5x3 for images with 3
color channels).
Convolutional Layer
Forward pass: We slide (or convolve) each filter across the input volume and compute dot products
between the entries of the filter and the input, followed by an elementwise nonlinear activation
function.
Every filter produces a 2-dimensional activation map. (e.g. applying 12 filters of dimensions 5x5x3 to a
32x32x3 input, the conv layer can produce an output volume of 32x32x12, i.e., 12 activation maps with
dimensions 32x32)
Intuitively, the network will learn filters that activate when they see some type of visual feature, such as
an edge of some orientation, for example.
[Figure: convolving a filter over a zero-padded input]
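A naive sketch of this forward pass for a single filter, using explicit loops rather than optimized vectorized code; the filter size, padding, and input size are illustrative, chosen to match the 32x32 example above:

```python
import numpy as np

def conv_forward(x, f, b, pad=2):
    # Slide one filter f (h x w x depth) over input volume x (H x W x depth),
    # producing one 2-D activation map.
    x = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))   # zero-padding
    fh, fw, _ = f.shape
    oh, ow = x.shape[0] - fh + 1, x.shape[1] - fw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Dot product between the filter and the input patch under it.
            out[i, j] = np.sum(x[i:i+fh, j:j+fw, :] * f) + b
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32, 3))              # e.g. a 32x32 RGB image
f = rng.standard_normal((5, 5, 3))                # one 5x5x3 filter
amap = np.maximum(conv_forward(x, f, b=0.0), 0)   # plus ReLU nonlinearity
print(amap.shape)   # (32, 32): one activation map; 12 filters would give 32x32x12
```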
Pooling Layer
Its function is to progressively reduce the spatial size of the representation to reduce the number of
parameters and computation in the network, and hence to also control overfitting.
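A corresponding sketch of 2x2 max pooling with stride 2 (a common choice), which halves each spatial dimension of an activation map:

```python
import numpy as np

def max_pool(a, k=2):
    # Non-overlapping k x k max pooling: keep the largest activation per window.
    h, w = a.shape[0] // k, a.shape[1] // k
    return a[:h*k, :w*k].reshape(h, k, w, k).max(axis=(1, 3))

a = np.arange(16.0).reshape(4, 4)
print(max_pool(a))   # [[ 5.  7.]
                     #  [13. 15.]]
```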
Fully-Connected Layer
Same as before.
Used to output the final classification scores.
https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html
CNN for Real-Time Object Detection using
YOLO (“You Only Look Once”)
Assignment
Build a feedforward neural network with (at least) two fully-connected hidden layers to
perform automatic recognition of handwritten digits, using the MNIST database.
http://yann.lecun.com/exdb/mnist/
(The assignment description will be sent by e-mail)
References
https://www.kdnuggets.com/2018/02/8-neural-network-architectures-machine-learning-researchers-need-learn.html
https://www.youtube.com/watch?v=d14TUNcbn1k
http://neuralnetworksanddeeplearning.com/chap1.html
http://www.deeplearningbook.org/
https://www.youtube.com/watch?v=1L0TKZQcUtA
https://towardsdatascience.com/machine-learning-fundamentals-via-linear-regression-41a5d11f5220
References
Backpropagation:
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
https://ayearofai.com/rohan-lenny-1-neural-networks-the-backpropagation-algorithm-explained-abf4609d4f9d
http://neuralnetworksanddeeplearning.com/chap2.html
https://www.youtube.com/watch?v=tIeHLnjs5U8
References
CNNs:
https://hashrocket.com/blog/posts/a-friendly-introduction-to-convolutional-neural-networks
https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
http://web.stanford.edu/class/cs231a/lectures/intro_cnn.pdf
http://cs231n.stanford.edu/