Official (Closed) - Non Sensitive
Deep Learning in Image Recognition
Lecture: Use Cases of Convolutional Neural Networks and Implementation
• Specialist Diploma in Applied Generative AI
• Academic Year 2024/25
Topics
1. Recap of Neural Network and Convolutional Neural Network
2. Use Cases of Convnet
3. Gradio
1. Recap of Neural Network and Convolutional Neural Network
Definition of AI: the effort to automate intellectual
tasks normally performed by humans
AI is a general field that encompasses machine learning
and deep learning, and many other approaches which
don’t involve any learning.
Anatomy of a neural network
1. Layers: combined into a model (containing neurons/nodes)
2. Input data and corresponding targets / labels
3. Optimizer: determines how learning proceeds
[Figure: network diagram labelled Input, Hidden layers, Predicted Output, Actual Output]
An Example (refer to NN_Computation.xlsx for details)
Credit loan: Loan approve = 1, Loan reject = 0
Inputs: Age, Salary, Education
Network: Input, Layer 1 (ReLU), Layer 2 (Sigmoid/logistic), Output
Prediction Y’ = 0.5329
True Target Y = 1.0
Loss Score = Y - Y’ = 0.4671
Forward Propagation
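The forward pass in the credit-loan example can be sketched in NumPy. This is only an illustration: the input values, the two hidden neurons, and all weights and biases below are hypothetical (the real ones are in NN_Computation.xlsx), so it will not reproduce the 0.5329 prediction on the slide.

```python
import numpy as np

def relu(z):
    # ReLU keeps positive values and zeroes out negatives.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Sigmoid/logistic squashes any number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical, already-scaled inputs: age, salary, education.
x = np.array([0.4, 0.7, 0.5])

# Hypothetical weights and biases (assumed values, not from the spreadsheet).
W1 = np.array([[0.2, -0.3],
               [0.5, 0.1],
               [-0.4, 0.6]])   # 3 inputs -> 2 hidden neurons
b1 = np.array([0.1, 0.0])
W2 = np.array([0.7, -0.2])     # 2 hidden neurons -> 1 output
b2 = 0.05

h = relu(x @ W1 + b1)          # Layer 1 with ReLU
y_pred = sigmoid(h @ W2 + b2)  # Layer 2 with sigmoid: prediction Y'

y_true = 1.0                   # True target Y (loan approved)
loss = y_true - y_pred         # Loss score as defined on the slide: Y - Y'
print(f"prediction={y_pred:.4f}, loss={loss:.4f}")
```

In practice a loss such as binary cross-entropy is usually used for training; the simple difference above just mirrors the slide.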
Input Data
Build the model (“network”)
Compile
Training
Evaluate
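The five steps (input data, build, compile, train, evaluate) might look like this in Keras. It is a minimal sketch on synthetic data: the feature count, layer sizes, epochs and labelling rule are all assumptions chosen to keep the example small, not the settings used in the practical.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# 1. Input data: 200 synthetic samples with 3 features and a 0/1 label.
rng = np.random.default_rng(0)
x_train = rng.random((200, 3)).astype("float32")
y_train = (x_train.sum(axis=1) > 1.5).astype("float32")

# 2. Build the model ("network").
model = keras.Sequential([
    keras.Input(shape=(3,)),
    layers.Dense(8, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# 3. Compile: choose the optimizer, loss function and metrics.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# 4. Training.
history = model.fit(x_train, y_train, epochs=5, batch_size=32, verbose=0)

# 5. Evaluate (on the training data here, only to keep the sketch short;
#    a real evaluation should use held-out test data).
loss, accuracy = model.evaluate(x_train, y_train, verbose=0)
print(f"loss={loss:.3f}, accuracy={accuracy:.3f}")
```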
1. Introduction to CNN
https://www.youtube.com/watch?v=Gu0MkmynWkw
1. Introduction to CNN
The MNIST classification problem using CNN
[Figure labels: Convolution operation, Max-pooling operation]
2.1 The convolution operation
A Simple Example of 2D convolution (refer to the Excel spreadsheet for details)
Input Image: (5, 5, 1)
One Filter: (3, 3)
Output: (3, 3, 1)
Filter weights:
0 1 2
2 2 0
0 1 2
[Figure labels: Pixel value, Filter weight]
Source:
https://towardsdatascience.com/intuitively-understanding-convolutions-for-deep-learning-1f6f42faee1
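A small NumPy sketch of this 2D convolution, using the 3x3 filter weights from the slide. The 5x5 input below is a made-up image (the pixel values 0 to 24), since the slide's pixel values are only in the figure; there is no padding and the stride is 1.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            # Multiply pixel values by filter weights and add them up.
            out[i, j] = np.sum(window * kernel)
    return out

# Hypothetical 5x5 input image (assumed values for illustration).
image = np.arange(25).reshape(5, 5)

# Filter weights from the slide.
kernel = np.array([[0, 1, 2],
                   [2, 2, 0],
                   [0, 1, 2]])

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
print(feature_map)
```

A (5, 5) input with a (3, 3) filter gives a (3, 3) output because the filter fits in only three positions along each axis.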
2.1 The convolution operation
Padding
Input Image: (5, 5, 1)
One Filter: (3, 3)
Padding = 1
Output (5, 5, 1)
Source:
https://towardsdatascience.com/intuitively-understanding-convolutions-for-deep-learning-1f6f42faee1
2.1 The convolution operation
Strides
Input Image: (5, 5, 1)
One Filter: (3, 3)
Strides = 2
Output (2, 2, 1)
Source:
https://towardsdatascience.com/intuitively-understanding-convolutions-for-deep-learning-1f6f42faee1
2.1 The convolution operation
For each filter: take the dot product of every input channel with the matching filter slice, sum across the channels, then add the bias
Example: (-2) + (-1) + (0) + 0 = -3
To calculate the width of the output:
Input width = 5
Filter width = 3
Padding = 1
Strides = 2
Output Depth = # of Filters = 2
Output Height or Width = (Input width + (2*Padding) - Filter width) / Strides + 1
= (5 + 2*1 - 3) / 2 + 1 = 3
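The output-size formula can be checked with a few lines of Python. The function name `conv_output_size` is made up for this sketch, and integer (floor) division is assumed for cases where the stride does not divide evenly.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Output height or width of a convolution along one axis."""
    return (input_size + 2 * padding - filter_size) // stride + 1

# The four cases from the slides, all with a 5-wide input and a 3-wide filter.
print(conv_output_size(5, 3))                       # no padding, stride 1 -> 3
print(conv_output_size(5, 3, padding=1))            # padding 1 -> 5
print(conv_output_size(5, 3, stride=2))             # stride 2 -> 2
print(conv_output_size(5, 3, padding=1, stride=2))  # padding 1, stride 2 -> 3
```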
2.2 The max-pooling operation
Down sample the tensors by taking the max value
in a window (e.g. 2x2)
Source:
https://computersciencewiki.org/index.php/Max-pooling_/_Pooling
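Max-pooling with a 2x2 window can be sketched in NumPy as below. The 4x4 input values are invented for illustration, and the windows are assumed not to overlap (stride equal to the window size), which is the common default.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Down-sample a 2D array by taking the max of each size x size window."""
    h, w = x.shape
    # Drop any leftover rows/columns that do not fill a whole window.
    h, w = h - h % size, w - w % size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Hypothetical 4x4 feature map.
x = np.array([[1, 3, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]])

pooled = max_pool2d(x)
print(pooled)
# [[6 8]
#  [3 4]]
```

Each output value is the largest number in its 2x2 window, so the 4x4 input shrinks to 2x2.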
4.2 To prevent overfitting
1. Reducing network size (tweak hyperparameters)
2. Adding weight regularization
3. Adding dropout
4. Getting more training data
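The first three techniques might be combined in Keras as in the sketch below. The input shape, layer sizes, L2 factor (0.001) and dropout rate (0.5) are assumed values for illustration, not tuned recommendations.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    # 1. Smaller network: only 16 filters and one small Dense layer.
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    # 2. Weight regularization: L2 penalty on this layer's weights.
    layers.Dense(32, activation="relu",
                 kernel_regularizer=regularizers.l2(0.001)),
    # 3. Dropout: randomly zero half of the activations during training.
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])
model.summary()
```

Dropout is only active during training; at prediction time Keras passes all activations through.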
Using a pre-trained CNN
We will use a large CNN trained on the
ImageNet dataset, which has:
1. 1.4 million labeled images and 1,000
different classes
2. Many animal classes, including different
species of cats and dogs
The well-known image processing
models:
1. VGG 16 (we will use this one) and VGG 19
2. Inception V3, ResNet50, Xception,
MobileNet
3. All available in keras.applications
Pre-trained CNN
Our own classifier vs VGG16 classifier
Our own: Cat & Dog classifier (2 classes)
VGG model: FC classifier (1000 classes)
Pre-trained CNN
Objective: use the VGG16 model and train our own FC classifier
Conv blocks 1 to 5 (all frozen), followed by our own FC Classifier
Conv_Base = VGG16 minus its FC Classifier
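One way to set this up in Keras is sketched below: load VGG16 without its FC classifier, freeze it, and stack a new classifier on top. The function name `build_transfer_model`, the 150x150 input size and the 256-unit Dense layer are assumptions for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_transfer_model(weights="imagenet", input_shape=(150, 150, 3)):
    """Frozen VGG16 conv base plus our own 2-class (cat vs dog) FC classifier."""
    # include_top=False drops VGG16's own 1000-class FC classifier.
    conv_base = keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=input_shape
    )
    conv_base.trainable = False  # freeze all five conv blocks

    model = keras.Sequential([
        keras.Input(shape=input_shape),
        conv_base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),   # our own FC classifier
        layers.Dense(1, activation="sigmoid"),  # 1 = dog, 0 = cat (assumed)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Calling `build_transfer_model()` with the default `weights="imagenet"` downloads the pre-trained weights on first use; passing `weights=None` builds the same architecture with random weights, which is handy for a quick check.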
Going beyond the Sequential model
Xception CNN
2. Use cases of Convnet
Medical Diagnostics
1. Tumor detection, anomaly
detection
Autonomous Driving
Three types of input sensors:
- Camera
- Light Detection And Ranging (LiDAR)
- Radar
https://arxiv.org/pdf/1910.07738
Video Analysis
1. Action recognition, video segmentation
Video analytics module - fight and weapon detection: https://www.youtube.com/watch?v=_zpHzxAQQ2o
Attendance Management System Using Face Recognition: https://www.youtube.com/watch?v=EHgjYXWtaIs
Demo: Teachable Machine
3. Gradio
https://www.gradio.app/guides/quickstart
What is Gradio?
Open-source Python package for quickly building a demo or web
application around a model or any Python function
Share what you built through a link
Gradio
Gradio includes pre-built components that can be used as inputs
or outputs in your Interface with a single line of code.
Components include preprocessing steps that convert user data
submitted through the browser into something that can be used by a
Python function, and postprocessing steps that convert values
returned by a Python function into something that can be
displayed in a browser.
Let’s try our hand at visualizing our model!
(Practical 5: Building a UI using Gradio)
Wrapping up
We had a recap of what was covered
Real life use cases
Building your own UI using Gradio
Next week: Quiz and Assignment Consultation