
BAI701

Module-2

Chapter 1: Basics of Supervised Deep Learning

2.1 Introduction
The use of supervised and unsupervised deep learning models has grown at a fast rate due to their
success in learning complex problems. High-performance computing resources, the availability
of huge amounts of data (labeled and unlabeled), and state-of-the-art open-source libraries are
making deep learning more and more feasible for various applications.

2.2 Convolutional Neural Network (ConvNet/CNN)


Convolutional Neural Network, also known as ConvNet or CNN, is a deep learning technique that
consists of multiple layers. ConvNets are inspired by the biological visual cortex.
ConvNets have shown excellent performance on several applications such as image classification,
object detection, speech recognition, natural language processing, and medical image analysis.
Convolutional neural networks are the powering core of computer vision, which has many applications
including self-driving cars, robotics, and treatments for the visually impaired. The main
concept of ConvNets is to obtain local features from the input (usually an image) at the lower layers and
combine them into more complex features at the higher layers. However, due to its multilayered
architecture, it is computationally expensive, and training such networks on a large dataset takes
several days. Therefore, such deep networks are usually trained on GPUs. Convolutional neural
networks are so powerful on visual tasks that they outperform almost all the conventional methods.

2.3 Evolution of Convolutional Neural Network Models

LeNet

 The first practical convolutional neural network (CNN), designed to classify handwritten
digits (MNIST).
 Used backpropagation for training and was adopted for reading handwritten checks.
 Did not scale well to larger problems due to:
o Small labeled datasets

Pooja R Rao, Asst. Professor, Dept of CSE(DS), RNSIT. Page 1


o Slow computers
o Use of unsuitable activation functions (like sigmoid/tanh) leading to vanishing
gradients, which make training deep networks difficult.

AlexNet

 Achieved the first major breakthrough in 2012 by winning the ImageNet Large-Scale
Visual Recognition Challenge (ILSVRC).
 Reduced classification error rate from 26% to 15%.
 Improvements over LeNet include:
o Large labeled image database (ImageNet), which contained around 15 million
labeled images from a total of over 22,000 categories, was used.
o The model was trained on high-speed GTX 580 GPUs for 5 to 6 days.
o Use of ReLU activation function (f(x) = max(x, 0)), which is faster and avoids
vanishing gradient problems.
 Architecture: 5 convolutional layers, 3 pooling layers, 3 fully connected layers, and a
1000-way softmax classifier.

ZFNet (2013):

o An improved version of the AlexNet architecture, obtained by reducing the first-layer
filter size from 11×11 to 7×7 and the stride from 4 to 2.
o This led to better feature extraction and fewer dead features.
o ZFNet won the ILSVRC 2013 competition.

VGGNet (2014):

o The depth of the network was increased to 19 layers by adding more convolutional
layers with 3 × 3 filters (stride 1, padding 1), along with 2 × 2 max-pooling layers
with stride 2.
o The deeper, simpler architecture improved accuracy significantly.
o VGGNet achieved a 7.32% error rate and was the runner-up in ILSVRC 2014.

GoogLeNet (2015):

Google developed a ConvNet model called GoogLeNet in 2015. It uses an inception module which
helps in reducing the number of parameters in the network. The inception module is a
concatenated layer of convolution (3 × 3 and 5 × 5) and pooling sub-layers at
different scales, with their output filter banks concatenated into a single output vector that forms the
input for the succeeding stage. These sub-layers are not stacked sequentially; instead, they are
connected in parallel, as shown in Fig. 2.1.

In order to compensate for the additional computational complexity due to the extra convolution
operations, 1 × 1 convolutions are used to reduce computation before the expensive 3 × 3
and 5 × 5 convolutions are performed. The GoogLeNet model has two convolutional layers, four max-
pooling layers, nine inception layers, and a softmax layer. This special inception
architecture gives GoogLeNet 12 times fewer parameters than AlexNet.

Increasing network depth can improve accuracy by learning more features, but has limits:

1. Vanishing gradients: gradients shrink as they propagate backward through many layers, so the early layers learn very slowly.
2. Optimization difficulty: too many parameters make training harder.

To address this, network depth should be increased carefully.

GoogLeNet won ILSVRC 2014 with a 6.7% error rate.

Later versions include Inception V3 (2016) and Inception-ResNet (2017).

ResNet:

Microsoft Research Asia proposed a CNN architecture in 2015 which is 152 layers deep and is
called ResNet. ResNet introduced residual connections, in which the output of a conv-ReLU-conv
series is added to the original input and then passed through a Rectified Linear Unit (ReLU), as
shown in Fig. 2.2. In this way, information is carried from the previous layer to the next layer,
and during backpropagation the gradient flows easily because the addition operation
distributes the gradient. ResNet proved that a complex architecture like Inception is not required
to achieve the best results; a simple and deep architecture can be tweaked to get better results.
ResNet performed exceptionally well, winning ILSVRC 2015 with a 3.6% error rate, surpassing human-
level accuracy. Despite its depth, ResNet had fewer parameters than VGGNet.

Inception-ResNet (2017):

 Combined the Inception module with residual connections to form a hybrid model.
 This design significantly increased training speed.
 It slightly outperformed ResNet in terms of accuracy.

Xception:

A convolutional neural network architecture based on depthwise separable convolution layers
is called Xception. The architecture is inspired by the Inception model, which is why it
is called Xception (Extreme Inception). The Xception architecture is a stack of depthwise separable
convolution layers with residual connections. Xception has 36 convolutional layers organized
into 14 modules, all having linear residual connections around them, except for the first and
last modules. Xception has been claimed to perform slightly better than Inception V3 on
ImageNet. Table 2.1 and Fig. 2.3 show the classification performance of VGG-16, ResNet-152,
Inception V3, and Xception on ImageNet.

SqueezeNet: Researchers developed SqueezeNet to reduce the size and complexity of
convolutional neural networks without sacrificing accuracy. The approach included pruning
small-weight parameters to create sparse models and retraining them. Additionally, SqueezeNet
adopted three main strategies to minimize parameters and computation:

 (a) Replacing 3 × 3 filters with 1 × 1 filters.
 (b) Reducing the number of input channels to 3 × 3 filters.
 (c) Delaying subsampling to later layers to preserve larger activation maps.

With these methods, SqueezeNet achieved AlexNet-level accuracy on ImageNet using 50 times
fewer parameters.

ShuffleNet: Another ConvNet architecture called ShuffleNet was introduced in 2017 for devices
with limited computational power, like mobile devices, without compromising accuracy.
ShuffleNet used two ideas, pointwise group convolution and channel shuffle, to considerably
decrease the computational cost while maintaining accuracy.

2.4 Convolution Operation

Convolution is a mathematical operation performed on two functions and is written as (f * g),
where f and g are the two functions. The output of the convolution operation for domain n is defined
as

F(n) = (f * g)(n) = Σₘ f(m) g(n − m)

For time-domain functions, n is replaced by t. The convolution operation is commutative in
nature, so it can also be written as

F(n) = (g * f)(n) = Σₘ f(n − m) g(m)

Convolution is one of the important operations used in digital signal processing and is
used in many areas, including statistics, probability, natural language processing, computer
vision, and image processing. It can be applied to a two-dimensional function by sliding one
function on top of the other, multiplying and adding. The convolution operation can be applied to
images to perform various transformations; here, images are treated as two-dimensional functions.
An example of a two-dimensional filter, a two-dimensional input, and a two-dimensional feature
map is shown in Fig. 2.4. Let the 2D input (i.e., the 2D image) be denoted by A, the 2D filter of size
m × n be denoted by K, and the 2D feature map be denoted by F. Here, the image A is convolved
with the filter K to produce the feature map F. This convolution operation is denoted by A * K
and is mathematically given as

F(i, j) = (A * K)(i, j) = Σₘ Σₙ A(i − m, j − n) K(m, n)
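The 2D operation described above can be sketched directly in NumPy. This is a plain, unoptimized loop version for illustration only; the function name and the toy input are illustrative, and real frameworks use heavily optimized routines instead:

```python
import numpy as np

def conv2d(A, K):
    """'Valid' 2D convolution of image A with filter K (no padding, stride 1)."""
    m, n = K.shape
    H, W = A.shape
    Kf = np.flip(K)  # true convolution flips the kernel (cross-correlation does not)
    F = np.empty((H - m + 1, W - n + 1))
    for i in range(F.shape[0]):
        for j in range(F.shape[1]):
            # slide the flipped filter over the local region, multiply and add
            F[i, j] = np.sum(A[i:i + m, j:j + n] * Kf)
    return F

A = np.arange(16, dtype=float).reshape(4, 4)  # a toy 4 x 4 "image"
K = np.ones((2, 2))                           # a symmetric 2 x 2 filter
F = conv2d(A, K)                              # feature map of size 3 x 3
```

Note that the output shrinks from 4 × 4 to 3 × 3, which matches the "valid" output-size rule discussed later in this module.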

2.5 Architecture of CNN

Traditional Neural Network Limitations

 Fully connected layers connect every neuron in one layer to every neuron in the previous
layer.
 This dense connectivity does not scale well to large images.

Need for CNN

 CNNs are better for large images and data with grid-like structure (e.g., 1D time-series, 2D
images, 3D volumes, 4D videos).
 Designed to process structured data efficiently.

Key Features of CNNs

 (i) Local Receptive Field:
o Each neuron connects only to a small region of the input.
o Helps extract local features like edges and corners.
 (ii) Weight Sharing:
o The same filter (set of weights) is applied across all positions in the input.
o Reduces the number of parameters and enables feature detection anywhere in the input.
 (iii) Subsampling (Pooling):
o Reduces spatial size and network parameters.
o The most common method is max-pooling.

A typical convolutional neural network consists of the following layers:

• Convolutional layer
• Activation function layer (ReLU)
• Pooling layer
• Fully connected layer
• Dropout layer

These layers are stacked up to make a full ConvNet architecture. Convolutional and
activation function layers are usually stacked together, followed by an optional
pooling layer. The fully connected layer makes up the last layer of the network, and the
output of the last fully connected layer produces the class scores of the input image.
In addition to these main layers, a ConvNet may include optional layers like a batch
normalization layer to improve the training time and a dropout layer to address the
overfitting issue.
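A quick back-of-the-envelope comparison makes the weight-sharing point concrete. The numbers here are hypothetical (a 128 × 128 grayscale input and 100 hidden neurons), chosen only to show the scaling difference:

```python
# Fully connected: every input pixel connects to every neuron,
# so the parameter count grows with the image size.
image_pixels = 128 * 128
fc_weights = image_pixels * 100      # 1,638,400 weights for just one layer

# Convolutional with weight sharing: one 3 x 3 filter is reused at
# every position, so its weight count is independent of image size.
conv_weights_per_filter = 3 * 3      # 9 weights, wherever the feature appears
```

This is why dense connectivity does not scale to large images while convolutional layers do.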

2.5.1 Convolution Layer

 The convolution layer is the main building block of a convolutional neural network (CNN).
 It uses the convolution operation (denoted by *) instead of general matrix multiplication.
 It has a set of learnable filters or kernels as its parameters.
 Its main task is to detect features in local regions of the input image that are common across
the dataset.
 A feature map is created for each filter by convolving it over subregions of the image.
 The process includes performing the convolution, adding a bias term, and applying an
activation function.
 The local receptive field is the region of the input the filter is applied to, and its size matches
the filter size.
 Figure 2.5 illustrates how a T-shaped filter is convolved with the input to get the feature
map.
 After adding the bias, a nonlinear activation function is applied to introduce nonlinearity
into the model.


Filters/Kernels
 The weights in each convolutional layer define the convolution filters (kernels)
 There can be multiple filters in a single convolutional layer.
 Each filter is designed to capture specific features like edges or corners.
 During the forward pass, each filter slides over the input’s width and height to produce its
feature map.
Hyperparameters
 Convolutional neural networks have hyperparameters that control model behavior,
output size, runtime, and memory.
 Four important hyperparameters in the convolution layer are:

 Filter Size: Typically between 3×3 and 11×11. Size is independent of input size.
 Number of Filters: Can vary. For example, AlexNet used 96 filters of size 11×11 in its
first layer, whereas VGGNet used smaller filters of size 3×3.
 Stride: Number of pixels the filter moves at each step. Small stride = more overlap and
larger output size; large stride = less overlap and smaller output size.
 Zero Padding: Number of pixels added as zeros around the input to control the output’s
spatial size.

Each filter in the convolution layer produces a feature map of size ([A − K + 2P]/S) + 1, where A is
the input volume size, K is the size of the filter, P is the amount of padding applied, and S is the
stride. Suppose the input image has size 128 × 128, and 5 filters of size 5 × 5 are applied with
single stride and zero padding, i.e., A = 128, K = 5, P = 0, and S = 1. The number of feature maps produced will
be equal to the number of filters applied, i.e., 5, and the size of each feature map will be ([128 − 5
+ 0]/1) + 1 = 124. Therefore, the output volume will be 124 × 124 × 5.
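The worked example can be checked with a few lines of Python, a direct transcription of the formula above (the function name is illustrative):

```python
def conv_output_size(A, K, P, S):
    """Feature-map side length: ([A - K + 2P] / S) + 1."""
    return (A - K + 2 * P) // S + 1

# 128 x 128 input, 5 x 5 filters, zero padding, single stride
side = conv_output_size(128, 5, 0, 1)
# with 5 filters, the output volume is side x side x 5
```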

2.5.2 Activation Function (ReLU)

 The output of each convolutional layer is passed through an activation function layer.
 The activation function transforms the feature map into an activation map.
 It determines the output signal of a neuron for a given input.
 Activation functions typically squash inputs to a specific range (e.g., 0–1 or −1 to 1).
 They perform a mathematical operation on the input to produce the neuron's activation
level.
 A good activation function is usually continuous and differentiable everywhere.
 Differentiability is important for gradient-based training methods used in ConvNets.
 If non-gradient-based methods are used, differentiability is not required.
 Many activation functions are used in ANNs and some of the commonly used activation
functions are as follows:
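As a sketch, three of the most commonly used activation functions (sigmoid, tanh, and ReLU, all mentioned earlier in this module) can be written as:

```python
import numpy as np

def sigmoid(x):
    # squashes input to the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # squashes input to the range (-1, 1)
    return np.tanh(x)

def relu(x):
    # f(x) = max(x, 0): fast to compute and helps avoid vanishing gradients
    return np.maximum(x, 0.0)
```

Sigmoid and tanh saturate for large inputs, which is the source of the vanishing-gradient problem noted in the discussion of LeNet; ReLU does not saturate for positive inputs, which is why AlexNet adopted it.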

2.5.3 Pooling Layer
 Pooling layers follow the convolution and activation layers in ConvNets to reduce the
spatial size of feature maps.
 This reduction lowers the number of parameters and computational cost in the network.
 A pooling layer down-samples the input feature maps by summarizing regions of neurons
to select representative values.
 Max-pooling is the most common technique, dividing the input into small regions (e.g.,
2 × 2) and selecting the maximum value from each region.
 For a 2 × 2 region, max-pooling outputs the single highest value among the four values.
 Other pooling types include average pooling (computes the mean of the region) and L2-
norm pooling (calculates the square root of the sum of squares of the values).
 Pooling layers discard less important details while preserving essential features in a smaller,
more manageable form.
 The idea behind pooling is that detecting a feature is more important than knowing its exact
location.
 This strategy works well for simple tasks but can have limitations for more complex
problems.
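Max-pooling over 2 × 2 regions can be sketched as follows (a minimal loop version, assuming an input whose sides are divisible by the pool size; the names are illustrative):

```python
import numpy as np

def max_pool(F, size=2):
    """Down-sample feature map F by taking the max of each size x size region."""
    H, W = F.shape
    out = np.empty((H // size, W // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            region = F[i * size:(i + 1) * size, j * size:(j + 1) * size]
            out[i, j] = region.max()   # keep only the strongest activation
    return out

F = np.arange(16, dtype=float).reshape(4, 4)
P = max_pool(F)    # 4 x 4 feature map reduced to 2 x 2
```

Average pooling or L2-norm pooling would replace `region.max()` with the mean or the root of the sum of squares, respectively.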

2.5.4 Fully Connected Layer


• Convolutional Neural Networks (CNNs) consist of two main stages: feature extraction and
classification.
• The feature extraction stage includes convolution and pooling layers that detect features
from input data.

• Once enough features are extracted, the classification stage begins.
• The classification stage consists of one or more fully connected layers followed by a
classifier.
• Fully connected layers take input from all neurons of the previous layer, enabling every
value to contribute to the prediction.
• These layers transform the spatial feature data into class scores or probabilities.
• Multiple fully connected layers can be used to learn complex feature relationships.
• The output from the last fully connected layer is sent to a classifier.
• Common classifiers used are Softmax and Support Vector Machines (SVMs).
• The Softmax classifier outputs class probabilities that sum to 1.
• The SVM classifier outputs class scores, and the class with the highest score is selected.
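The Softmax classifier mentioned above can be sketched in a few lines. Subtracting the maximum score first is a standard numerical-stability trick and does not change the result:

```python
import numpy as np

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    e = np.exp(scores - np.max(scores))  # shift for numerical stability
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
# the highest score gets the highest probability, and the outputs sum to 1
```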
2.5.5 Dropout
 Deep neural networks have multiple hidden layers that help learn complex features.
 These are followed by fully connected layers used for decision-making.
 Fully connected layers are prone to overfitting due to their dense connections.
 Overfitting occurs when the model performs well on training data but poorly on new,
unseen data.
 To address overfitting, a dropout layer is used during training.
 Dropout randomly removes some neurons and their connections from the network during
each training iteration.
 The remaining reduced network is trained on the data at that stage.
 Dropped-out neurons are reinserted later with their original weights.
 This technique reduces overfitting and enhances the model's ability to generalize.
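One common way to sketch this is "inverted" dropout, a widely used variant in which the surviving activations are rescaled during training so that no change is needed at test time. Here p is the drop probability, and the names are illustrative:

```python
import numpy as np

def dropout(activations, p=0.5, rng=None):
    """Randomly zero each unit with probability p, rescaling the survivors."""
    rng = rng if rng is not None else np.random.default_rng(0)
    mask = rng.random(activations.shape) >= p   # True = neuron is kept
    return activations * mask / (1.0 - p)       # keep expected activation unchanged

a = np.ones((4, 4))
out = dropout(a, p=0.5)   # roughly half the entries become 0, the rest 2.0
```

A fresh random mask is drawn on every training iteration, which matches the description above of neurons being removed and later reinserted.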

2.6 Challenges and Future Research Direction:

 Strong Performance: Convolutional Neural Networks (ConvNets) have shown excellent
results in tasks like object classification and detection, sometimes matching human-level
accuracy.
 Vulnerabilities Exist: Despite their success, ConvNets are vulnerable to small,
imperceptible changes in input images, which can lead to incorrect classifications.
 Cause of Vulnerability: One key reason for this vulnerability is the pooling operation,
which reduces the feature space but also discards important spatial information.
 Loss of Spatial Relationships: ConvNets detect if a feature is present in a region but fail
to capture the exact spatial relationships between features, making it harder to recognize
complex objects.
 Reliability Concern: These limitations raise concerns about the generalization and
reliability of ConvNets in real-world applications.
 Capsule Networks as a Solution: Capsule Networks have been proposed to overcome
some of these issues. They use capsules (groups of neurons) to represent objects and their
parts more precisely.
 Dynamic Routing: Instead of max pooling, Capsule Networks use dynamic routing to
preserve spatial relationships between features across layers.
 Ongoing Research: Capsule Networks are still in the early stages of research, and their
effectiveness across various visual tasks remains under investigation.
