Motivation CNN vs. FCN Convolution Backpropagation References
Machine Learning (CE 40717)
Fall 2024
Ali Sharifi-Zarchi
CE Department
Sharif University of Technology
January 5, 2025
Ali Sharifi-Zarchi (Sharif University of
Technology) Machine Learning (CE 40717) January 5, 2025 1 / 87
Motivation CNN vs. FCN Convolution Backpropagation References
1 Motivation
2 CNN vs. FCN
3 Convolution
4 Backpropagation
5 References
How Do Humans See?
How can we enable computers to see?
Figure adapted from [4]
What Computers See: Images As Numbers
An image is just a matrix of numbers, e.g., 1080 × 1080 × 3 for an RGB image.
Question: is this Lincoln? Washington? Jefferson? Obama?
How can the computer answer this question?
Figure adapted from [4]
What Is Computer Vision?
Figure adapted from source
What Is Computer Vision?
Figure adapted from source
Noisy image Denoised image
What Is Computer Vision?
Hand-Crafted Features: Before Deep Learning
Two types of features used for action recognition.
Figure adapted from source & source
Hand-Crafted Features: Before Deep Learning
Figure adapted from source
Image Features
Fusion of multi-scale bag of deep visual words features of chest X-ray images to detect
COVID-19 infection
Figure adapted from source
The Effect of Deep Learning
Comparison between Deep Learning and Traditional Models
Figure adapted from source
FCNs On Images
• What have we been using so far?
• Dense vector multiplication
Figure adapted from [3]
Fully Connected Layers
• Neurons in a single layer operate independently and do not share weights, even when they receive the same inputs.
• To become shift-invariant, they must be shown many samples of a pattern at various times or locations.
• Regular neural nets don’t scale well to full images.
• Parameters would add up quickly!
• Full connectivity is wasteful, and the huge number of parameters would quickly lead to overfitting.
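To make "parameters add up quickly" concrete, here is a back-of-the-envelope count for a single fully connected layer. The 1080 × 1080 × 3 input size comes from the earlier slide; the hidden width of 1000 units is an assumed, illustrative number:

```python
# Parameter count of one fully connected layer on a flattened RGB image.
# The 1000-unit hidden layer is an illustrative assumption.
height, width, channels = 1080, 1080, 3
hidden_units = 1000

inputs = height * width * channels   # 3,499,200 input values
weights = inputs * hidden_units      # one weight per (input, unit) pair
biases = hidden_units
total = weights + biases
print(total)                         # 3,499,201,000 parameters in one layer
```

Roughly 3.5 billion parameters for a single layer, before any further layers are added.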
Solution?
What can we do?
Invariance vs. Equivariance
• Invariance: The output remains the same when the input undergoes a transformation.
• Equivariance: The output varies predictably when the input undergoes a
transformation.
Figure adapted from source
Shift-invariance
• Cat detector f : R^d → R
• Shift operator S_v : R^d → R^d, shifting the image by vector v
Shift invariance: f(x) = f(S_v x)
Question: Will an MLP that recognizes the left image as a cat also recognize the
shifted image on the right as a cat?
A Problem
• In many problems the location of a pattern is not important.
• Only the presence of the pattern matters.
• Traditional MLPs are sensitive to pattern location.
• Moving a pattern by even one pixel produces an entirely different input that the MLP won’t recognize.
• Requirement: Network must be shift invariant.
Convolution On Images
Convolution (Refresher)
Matrix input preserving spatial structure
Figure adapted from [3]
Fully Connected Or Convolution
Question: Can I just do classification on a flattened image (e.g., a 1080×1080×3 image) with fully connected layers?
Fully Connected Or Convolution
Question: Can I just do classification on a flattened image (e.g., a 1080×1080×3 image) with fully connected layers?
Answer: No, using fully connected layers on such a large image is inefficient due to:
• Loss of spatial structure
• High parameter count
Fully Connected Or Convolution
Why Use Convolution:
• Exploit spatial structure: convolution captures local patterns by sliding filters over small patches of the image.
• Reduce parameters: by sharing weights across the image, convolution reduces the number of parameters.
• Build hierarchical features: convolution starts with small patterns, then combines them to recognize more complex patterns.
Figure adapted from source
What Is CNN?
• Convolutional neural networks (ConvNets or CNNs) are a class of deep learning models.
• They are one of the main architectures for image recognition, image classification, object detection, face recognition, etc.
• Like basic neural networks, CNNs have learnable parameters such as weights and biases.
• CNNs are heavily used in computer vision.
• There are 3 basic components to define CNN:
• The Convolution Layer
• The Pooling Layer
• The Output Layer or Fully Connected Layer
Architecture Of CNN
The basic idea of Convolutional Neural Networks (CNNs) is similar to Backpropagation
Neural Networks (BPNNs) but differs in implementation.
Figure adapted from source
The Basic Structure
• Alternating convolution (C) and subsampling (S) layers
• Subsampling allows flexible positioning of features.
Figure adapted from source
The Basic Structure
Three Main Types of Layers
• Convolutional Layer
• Neuron outputs are connected to local regions of the input.
• The same filter is applied across the entire image.
• CONV layer’s parameters consist of a set of learnable filters.
• Pooling Layer
• Performs a downsampling operation along the spatial dimensions.
• Fully-Connected Layer
• Typically used in the final stages of the network to combine high-level features and
make predictions.
CNN VS. FCN
• Fully Connected Networks (FCNs):
• High number of parameters, leading to overfitting on large inputs (e.g., images).
• Lack of spatial awareness, making them sensitive to the exact positioning of patterns.
• Suitable for structured data but inefficient for image processing.
• Convolutional Neural Networks (CNNs):
• Use filters to process only small parts of the input at a time (locality).
• Weight sharing and local connectivity reduce the number of parameters.
• Can recognize patterns independent of their location within the image (shift invariance).
• Efficient for image, video, and speech data.
What Is a Convolution?
Scanning an image with a "filter"
• A filter is really just a perceptron, with weights and a bias.
• At each location, the filter is multiplied component-wise with the underlying map values, and the products are summed along with the bias.
Figure adapted from [3]
Weights Showing Correlation
• The weights of the filter represent the appearance of the number "2".
• The green pattern has a higher correlation with the filter compared to the red.
• The green pattern is more likely to represent the number "2".
Figure adapted from [1]
What Is Convolution
• Convolve: slide the filter over the image spatially, computing dot products.
• This allows us to preserve the spatial structure of the input.
Figure adapted from [2]
What Is The Output
• It’s simply a neuron with local connectivity!
Figure adapted from [2]
Convolution Process
• Consider this as the filter we are going to use:
Convolution Process
• This is how we calculate the convolutional layer’s output:
Convolved Feature(i, j) = (I ∗ K)(i, j) = Σ_{a=0}^{k_h − 1} Σ_{b=0}^{k_w − 1} I(i + a, j + b) K(a, b)

I: input image
K: our kernel
k_h and k_w: the height and width of the kernel
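This formula can be sketched directly in NumPy (the 3 × 3 image and 2 × 2 kernel values below are made-up toy numbers):

```python
import numpy as np

def convolve2d(I, K):
    """Valid convolution as defined above:
    out(i, j) = sum over a, b of I(i + a, j + b) * K(a, b)."""
    kh, kw = K.shape
    H, W = I.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Component-wise multiply the kernel with the window, then sum
            out[i, j] = np.sum(I[i:i + kh, j:j + kw] * K)
    return out

I = np.arange(1, 10).reshape(3, 3)  # toy 3x3 image
K = np.eye(2)                       # toy 2x2 kernel
print(convolve2d(I, K))             # [[ 6.  8.] [12. 14.]]
```

Note the output shrinks from 3 × 3 to 2 × 2, matching the (n − n′ + 1) sizes discussed later.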
Convolution Process
• Convolving a 5×5×1 image with a 3×3×1 kernel to get a 3×3×1 convolved feature.
Figure adapted from [3]
Motivation CNN vs. FCN Convolution Backpropagation References
To Summarize
• We apply the same filter on different regions of the input.
• Convolutional filters learn to make decisions based on local spatial input, which
is an important attribute when working with images.
• Uses far fewer parameters than fully connected layers.
Convolution Outlook
• If we consider the image to be of size n × m × b and the filter to be of size n′ × m′ × b, we will have an activation map of size (n − n′ + 1) × (m − m′ + 1) × 1 for each filter.
Figure adapted from [2]
Convolution Outlook
• Multiple filters can be applied, each with its own bias, to extract different features from the input.
Figure adapted from [2]
What Is Stride
• The scans of the individual filters may advance by more than one pixel at a
time.
• The stride can be greater than 1.
• Effectively making the scan coarser.
• Saves computation, sometimes at the risk of losing information.
• This results in a reduction in the size of the resulting maps.
• They will shrink by a factor equal to the stride.
• This can happen at any layer.
Figure adapted from [3]
Stride-1
Closer look
• 7×7 input with a 3×3 filter
• This was a stride-1 filter
• ⇒ Outputs 5×5
Strides-2
Now let’s use Stride 2
• 7×7 input with a 3×3 filter
• This was a stride-2 filter
• ⇒ Outputs 3×3
Strides-3
What about Stride 3?
• 7×7 input with a 3×3 filter
Strides Formula
Output Size = (N − F) / Stride + 1

N = 7, F = 3 ⇒
• Stride 1: (7 − 3)/1 + 1 = 5
• Stride 2: (7 − 3)/2 + 1 = 3
• Stride 3: (7 − 3)/3 + 1 = 2.33, which is not an integer!
Figure adapted from [2]
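The arithmetic above can be wrapped in a small helper; the divisibility check mirrors the stride-3 failure case:

```python
def output_size(N, F, stride):
    """(N - F) / stride + 1, valid only when stride divides (N - F)."""
    if (N - F) % stride != 0:
        raise ValueError(f"stride {stride} does not fit: N - F = {N - F}")
    return (N - F) // stride + 1

print(output_size(7, 3, 1))  # 5
print(output_size(7, 3, 2))  # 3
# output_size(7, 3, 3) raises ValueError: the filter doesn't tile the input
```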
A problem?
What could cause problems?
Problem-1
• The borders don’t get enough attention.
• Outputs shrink!
Pixel utilization for convolutions of size 1 × 1, 2 × 2, and 3 × 3 respectively.
Problem-2
• Borders don’t get enough attention.
• Outputs shrink!
A 32×32 input shrinks to a 28×28 output.
(Information loss occurs.)
Solution?
What is the solution?
Padding
Padding: the secret ingredient to keep strides in line!
Padding
In practice, it is common to zero-pad the border.
Recall: the original formula for convolution without padding:

(N − F) / stride + 1

Now, we adjust the formula considering padding P:

(N + 2P − F) / stride + 1
Figure adapted from [2]
Padding
• Zero-padding is used not only for stride > 1, but also to prevent a reduction in output size even when S = 1.
• For stride > 1, zero-padding is adjusted to ensure that the size of the convolved output is ⌈N/S⌉. This is achieved by zero-padding the image with P = S⌈N/S⌉ − N.
• For a filter of width F:
• Odd F: pad on both the left and right with (F − 1)/2 columns of zeros.
• Even F: pad one side with F/2 columns of zeros, and the other with F/2 − 1 columns of zeros.
• The resulting image width is N + F − 1.
• The top and bottom zero-padding follows the same rules to maintain the map height after convolution.
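A quick sketch checking the odd-F rule at stride 1 (the N = 7, F = 3 numbers reuse the earlier stride slides):

```python
def same_pad(F):
    # Odd filter width: (F - 1) / 2 columns of zeros on each side
    assert F % 2 == 1, "this helper covers the odd-F case only"
    return (F - 1) // 2

def padded_output(N, F, P, stride=1):
    # Output size with padding: (N + 2P - F) / stride + 1
    return (N + 2 * P - F) // stride + 1

P = same_pad(3)                # 1 column of zeros on each side
print(padded_output(7, 3, P))  # 7: spatial size is preserved at stride 1
```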
Convnet
• ConvNet: a sequence of convolution layers interspersed with activation functions.
Figure adapted from [2]
Convnet
Question: What do convolutional filters at different levels of a ConvNet learn?
Convnet
Question: What do convolutional filters at different levels of a ConvNet learn?
Answer:
• Filters in the early layers typically detect simple features such as edges, textures,
and basic shapes.
• As we move deeper into the network, the filters learn more complex and
abstract features, such as specific parts of objects (e.g., a nose, eyes, or other
high-level patterns).
Convolution Output
• Accepts a volume of size W1 × H1 × D1
• Requires four hyperparameters:
• Number of filters K
• Their spatial extent F
• The stride S
• The amount of zero-padding P
• Produces a volume of size W2 × H2 × D2, where:
• W2 = (W1 − F + 2P)/S + 1
• H2 = (H1 − F + 2P)/S + 1
• D2 = K
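These formulas translate directly into a shape helper (a sketch; the sample numbers preview the worked example that follows):

```python
def conv_output_volume(W1, H1, D1, K, F, S, P):
    """Output volume (W2, H2, D2) of a conv layer, per the formulas above."""
    W2 = (W1 - F + 2 * P) // S + 1
    H2 = (H1 - F + 2 * P) // S + 1
    D2 = K  # one output channel per filter
    return W2, H2, D2

# A 32x32x3 input with K=10 filters of extent F=5, stride S=1, padding P=2
# keeps its spatial size:
print(conv_output_volume(32, 32, 3, K=10, F=5, S=1, P=2))  # (32, 32, 10)
```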
Parameter Setting
Common settings:
• With parameter sharing, convolution introduces F · F · D1 weights per filter, for a total of (F · F · D1) · K weights and K biases.
• K = powers of 2 (e.g., 32, 64, . . . )
• F = 3, S = 1, P = 1
• F = 5, S = 1, P = 2
• F = 5, S = 2, P = adjusted accordingly
• F = 1, S = 1, P = 0
Example
Question: Given an input volume of 32×32×3, we apply 10 filters, each of size 5×5, with a stride of 1 and padding of 2.
• What is the output volume size of this convolutional layer?
• How many parameters are required for this convolutional layer?
Example
Output Volume Size:

(32 + 2 × 2 − 5) / 1 + 1 = 32 (spatial dimensions), so the output volume is 32 × 32 × 10.

Number of Parameters:
Each filter has 5 × 5 × 3 + 1 = 76 parameters (+1 for the bias).
Therefore, total parameters: 76 × 10 = 760.
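The parameter count generalizes to (F · F · D1 + 1) · K; a one-line check of the numbers above:

```python
def conv_params(F, D1, K):
    # F*F*D1 weights per filter plus one bias, times K filters
    return (F * F * D1 + 1) * K

print(conv_params(F=5, D1=3, K=10))  # 760
```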
Example Overview
• We will tackle this problem with an example using stride = 2 and padding = 0.
Figures adapted from [7]
Convolution
Recall: convolution process
Another view
z1 = w1 × a1 + w2 × a2 + w3 × a3 + w4 × a6 + · · · + w9 × a13
z2 = w1 × a3 + w2 × a4 + w3 × a5 + w4 × a8 + · · · + w9 × a15
z3 = w1 × a11 + w2 × a12 + w3 × a13 + w4 × a16 + · · · + w9 × a23
z4 = w1 × a13 + w2 × a14 + w3 × a15 + w4 × a18 + · · · + w9 × a25
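These equations can be reproduced numerically. A sketch with made-up values: a1..a25 set to 1..25 and w1..w9 set to 1..9, using the stride-2, padding-0 setting of this example:

```python
import numpy as np

a = np.arange(1, 26, dtype=float).reshape(5, 5)  # a1..a25 (toy values)
w = np.arange(1, 10, dtype=float).reshape(3, 3)  # w1..w9 (toy values)

# Stride-2 valid convolution: z1..z4 in row-major order
z = np.array([np.sum(a[i:i + 3, j:j + 3] * w) for i in (0, 2) for j in (0, 2)])

# z1 uses a1, a2, a3, a6, a7, a8, a11, a12, a13, exactly as in the equations
af, wf = a.ravel(), w.ravel()
z1 = sum(wf[k] * af[idx] for k, idx in enumerate((0, 1, 2, 5, 6, 7, 10, 11, 12)))
print(z1 == z[0])  # True
```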
Whole Network
• For easier computation, we will assume one layer of convolution and a perceptron
as our whole network.
• Recall: w_i∗ = w_i − α × ∂L/∂w_i
Gradient
• We can easily calculate the gradients:
∂L/∂w1 = (∂z1/∂w1)(∂L/∂z1) + (∂z2/∂w1)(∂L/∂z2) + (∂z3/∂w1)(∂L/∂z3) + (∂z4/∂w1)(∂L/∂z4)
= a1 (∂L/∂z1) + a3 (∂L/∂z2) + a11 (∂L/∂z3) + a13 (∂L/∂z4)
Gradient
Do the same for each w
∂L/∂w1 = a1 (∂L/∂z1) + a3 (∂L/∂z2) + a11 (∂L/∂z3) + a13 (∂L/∂z4)
∂L/∂w2 = a2 (∂L/∂z1) + a4 (∂L/∂z2) + a12 (∂L/∂z3) + a14 (∂L/∂z4)
∂L/∂w3 = a3 (∂L/∂z1) + a5 (∂L/∂z2) + a13 (∂L/∂z3) + a15 (∂L/∂z4)
∂L/∂w4 = a6 (∂L/∂z1) + a8 (∂L/∂z2) + a16 (∂L/∂z3) + a18 (∂L/∂z4)
∂L/∂w5 = a7 (∂L/∂z1) + a9 (∂L/∂z2) + a17 (∂L/∂z3) + a19 (∂L/∂z4)
...
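The w1 formula can be verified with a finite-difference check. Toy values again; the loss L = z1 + z2 + z3 + z4 is an assumed simple choice, which makes every ∂L/∂z_k = 1:

```python
import numpy as np

a = np.arange(1, 26, dtype=float).reshape(5, 5)  # toy input a1..a25

def loss(w_flat):
    # Stride-2 valid convolution followed by L = z1 + z2 + z3 + z4
    w = w_flat.reshape(3, 3)
    return sum(np.sum(a[i:i + 3, j:j + 3] * w) for i in (0, 2) for j in (0, 2))

# Analytic gradient from the slide (every dL/dz_k = 1 under this loss):
# dL/dw1 = a1 + a3 + a11 + a13
analytic = a[0, 0] + a[0, 2] + a[2, 0] + a[2, 2]

# Numerical gradient for w1 via central differences
w_flat = np.arange(1, 10, dtype=float)
eps = 1e-6
w_plus, w_minus = w_flat.copy(), w_flat.copy()
w_plus[0] += eps
w_minus[0] -= eps
numeric = (loss(w_plus) - loss(w_minus)) / (2 * eps)
print(analytic, round(numeric, 4))  # 28.0 28.0
```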
Overview
Result
What’s next?
Proceed with gradient descent.
So Far
• The highly structured nature of image data.
• Local correlations are important.
• CNNs have local and shared connections.
• Strides & padding.
Up Next
Learn more details about CNNs!
Contributions
• These slides have been prepared thanks to:
• Ali Aghayari