» Convolutional Layer
∗ Recall we can apply several filters to the same input and stack
their outputs together. E.g.
[Figure: input a[0] of size 32 × 32 × 3 convolved with two kernels, each of size 3 × 3 × 3; the stacked output w[1] ∗ a[0] has size 30 × 30 × 2]
∗ To get a complete convolutional layer we pass the elements of
the output through a nonlinearity, usually after adding a bias.
∗ Kernel weights w[1] , input a[0] , bias/offset b[1] (weights w[1] and
bias b[1] are unknown parameters that need to be learned).
∗ After convolution output is w[1] ∗ a[0]
∗ Add bias to get z[1] = w[1] ∗ a[0] + b[1]
∗ Final output a[1] = g(z[1] ), for nonlinear activation function g(·).
Note: g(·) is applied separately to each element of z[1] .
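∗ As a concrete illustration (not from the slides), here is a minimal numpy sketch of one convolutional layer: a valid cross-correlation of the input with each kernel, plus the bias, followed by ReLU. The shapes and the helper name conv_layer are illustrative choices.
import numpy as np

def conv_layer(a0, w, b):
    # a0: input of shape (H, W, C_in); w: kernels of shape (C_out, k, k, C_in); b: biases (C_out,)
    H, W, C_in = a0.shape
    C_out, k = w.shape[0], w.shape[1]
    z = np.zeros((H - k + 1, W - k + 1, C_out))
    for c in range(C_out):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                # one element of z[1] = w[1] * a[0] + b[1]
                z[i, j, c] = np.sum(a0[i:i + k, j:j + k, :] * w[c]) + b[c]
    return np.maximum(z, 0)   # a[1] = g(z[1]) with g = ReLU

a0 = np.random.rand(32, 32, 3)      # input a[0], 32 x 32 x 3
w1 = np.random.randn(2, 3, 3, 3)    # two 3 x 3 x 3 kernels
b1 = np.zeros(2)                    # bias b[1]
a1 = conv_layer(a0, w1, b1)
print(a1.shape)                     # (30, 30, 2)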
» Choice of Activation Function g(·)
∗ ReLU (Rectified Linear Unit): g(x) = max(0, x), i.e. g(x) = x for x ≥ 0 and g(x) = 0 for x < 0
∗ Almost universally used nowadays (older choices were sigmoid
and tanh). Quick to compute, observed to work pretty well.
∗ But can lead to “dead” neurons where output is always zero
→ leaky ReLU
[Figure: plot of activation function f(x) against x ∈ [−2, 2] for ReLU, sigmoid and tanh]
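∗ A quick sketch (not from the slides) of ReLU and the leaky ReLU variant mentioned above; the slope value 0.01 is just an illustrative choice.
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def leaky_relu(x, alpha=0.01):
    # a small negative slope keeps a gradient for x < 0, helping avoid "dead" neurons
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))         # [0.  0.  0.  1.5]
print(leaky_relu(x))   # [-0.02  -0.005  0.  1.5]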
» Combining Convolutional Layers
∗ We can use the output from one convolution layer as the input
to another convolution layer
∗ E.g. Suppose input to first layer is 32 × 32 × 3 and convolve
this with 16 kernels of size 3 × 3 × 3 → output is 30 × 30 × 16
∗ Now use this 30 × 30 × 16 output as input to a second layer
with 8 kernels of size 3 × 3 × 16 → output is a 28 × 28 × 8 tensor
∗ All layers use ReLU activation function. Stride is 1.
∗ Typical way of drawing this schematically:
input 32 × 32 × 3 → [conv 3 × 3, 16] → 30 × 30 × 16 → [conv 3 × 3, 8] → 28 × 28 × 8
∗ Notes:
∗ “conv 3 × 3, 16” means convolutional layer with 3 × 3 kernel and
16 output channels.
∗ Number of channels in each kernel must match number of input
channels e.g. 3 × 3 × 3 for 3 input channels and 3 × 3 × 16 for 16
input channels, no choice here. So usually abbreviate to 3 × 3.
∗ Depth of cube roughly indicates #output channels.
» Combining Convolutional Layers
Some more notes:
∗ No padding used, so the output is smaller than the input. We could keep
the size the same by using padding.
∗ Number of kernel weights/parameters for first layer is
16 × 3 × 3 × 3 = 432, and for second layer 8 × 3 × 3 × 16 = 1152
∗ Using equations:
∗ Input a[0] to first layer, output is a[1] = g(w[1] ∗ a[0] + b[1] )
∗ Input a[1] to second layer, output is a[2] = g(w[2] ∗ a[1] + b[2] )
where w[1] , w[2] are layer kernel weights, b[1] , b[2] layer bias
parameters and g(·) is ReLU.
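∗ A minimal keras sketch of this two-layer stack (the parameter counts reported by model.summary() also include the biases, i.e. 432 + 16 = 448 and 1152 + 8 = 1160):
from tensorflow import keras
from tensorflow.keras.layers import Conv2D

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    Conv2D(16, kernel_size=(3, 3), activation="relu"),   # output 30 x 30 x 16, 16*3*3*3 = 432 kernel weights
    Conv2D(8, kernel_size=(3, 3), activation="relu"),    # output 28 x 28 x 8, 8*3*3*16 = 1152 kernel weights
])
model.summary()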
» Pooling Layer
∗ Pooling layers are used to reduce the size of the matrices (channels) in a tensor
∗ E.g. Suppose we want to downsample a 4 × 4 matrix to a 2 × 2 matrix:
1 2 3 4
1 3 2 3
3 2 1 4    →    (2 × 2 output)
6 1 1 2
∗ Use max-pooling with 2 × 2 block size and stride 2:
1 2 3 4
1 3 2 3        3 4
3 2 1 4    →   6 4
6 1 1 2
1. Partition input matrix into 2 × 2 blocks, stride of 2 means blocks
don’t overlap.
2. Calculate value of max element in each block.
3. Use max as value of corresponding output element.
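∗ A minimal numpy sketch of this 2 × 2 max-pooling with stride 2 (the reshape trick assumes the block size divides the input size):
import numpy as np

x = np.array([[1, 2, 3, 4],
              [1, 3, 2, 3],
              [3, 2, 1, 4],
              [6, 1, 1, 2]])

# split into non-overlapping 2 x 2 blocks and take the max of each block
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[3 4]
                #  [6 4]]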
» Pooling Layer
∗ E.g. Max-pooling with 3 × 3 block size and stride 1:
1 2 3 4
1 3 2 3        3 4
3 2 1 4    →   6 4
6 1 1 2
(the 3 × 3 window slides one element at a time, so neighbouring blocks overlap;
the four window maxima are 3, 4, 6 and 4)
∗ But mostly use stride=block size → no overlap between blocks
∗ Pooling block size and stride must be chosen compatible with
size of input matrix
∗ As well as max-pooling there is average pooling → output is
average of elements in a block. But rarely used.
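∗ A small numpy sketch (not from the slides) of the overlapping 3 × 3, stride-1 max-pooling above, plus average pooling for comparison:
import numpy as np

x = np.array([[1, 2, 3, 4],
              [1, 3, 2, 3],
              [3, 2, 1, 4],
              [6, 1, 1, 2]])

# 3 x 3 max-pooling with stride 1: overlapping windows, output is 2 x 2
out = np.array([[x[i:i + 3, j:j + 3].max() for j in range(2)] for i in range(2)])
print(out)   # [[3 4]
             #  [6 4]]

# average pooling with 2 x 2 blocks and stride 2 (rarely used)
avg = x.reshape(2, 2, 2, 2).mean(axis=(1, 3))
print(avg)   # [[1.75 3.  ]
             #  [3.   2.  ]]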
» Down-sampling Using Strided Convolution
∗ Recall that we can use strides > 1 in a convolutional layer → also reduces
size of output
∗ E.g. Applying the 2 × 2 kernel
      1 -1
      1 -1
  with stride 2:
1 2 3 4
1 3 2 3        -3 -2
3 2 1 4    →    6 -4
6 1 1 2
(the kernel is placed on each non-overlapping 2 × 2 block in turn and one value is computed per block)
→ for 4 × 4 input the output is reduced to 2 × 2
∗ Often works well, e.g. see Striving For Simplicity: The All Convolutional Net
https://arxiv.org/pdf/1412.6806.pdf
∗ Not quite the same as using (2,2) kernel with stride 1 and same padding
followed by (2,2) max-pooling:
∗ A (2,2) kernel with stride 1 and same padding does 16 convolutions,
whereas a (2,2) kernel with stride 2 calculates only 4 convolutions (so it is
faster/computationally cheaper)
∗ Max-pooling combines info from all 4 convolutions involving a 2 × 2 block,
whereas a (2,2) kernel with stride 2 only uses info from 1 convolution per
2 × 2 block (so it uses less info)
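∗ A minimal numpy sketch of the strided convolution above: one dot product of the 2 × 2 kernel with each non-overlapping 2 × 2 block:
import numpy as np

x = np.array([[1, 2, 3, 4],
              [1, 3, 2, 3],
              [3, 2, 1, 4],
              [6, 1, 1, 2]])
k = np.array([[1, -1],
              [1, -1]])

# stride 2: the kernel visits only the four non-overlapping 2 x 2 blocks
out = np.array([[np.sum(x[i:i + 2, j:j + 2] * k) for j in (0, 2)] for i in (0, 2)])
print(out)   # [[-3 -2]
             #  [ 6 -4]]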
» Fully-Connected Layer
∗ Fully-connected (FC) layer = one layer of MLP. Called dense
layer in keras.
∗ Each output is a function of a weighted sum of all of the inputs
∗ Input is a vector x (not a tensor or matrix). Output is y = f(wᵀx),
where w is a vector of weights/parameters and f(·) is a nonlinear function.
[Figure: inputs x1, x2, . . . , xn feeding into a single unit f which produces output y]
∗ If input is output from a convolution layer, i.e. a tensor, need
to flatten it before it can be used as input to FC layer.
∗ flattening → take all elements of the tensor and write them as a
list/array
∗ e.g. two channels
      1 2        4 5
      3 4   ,    6 7     → [1, 2, 3, 4, 4, 5, 6, 7].
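∗ A small numpy sketch reproducing the flattening example above (the channel-by-channel ordering shown here is just one convention; any fixed ordering works as long as it is used consistently):
import numpy as np

c1 = np.array([[1, 2], [3, 4]])   # first channel
c2 = np.array([[4, 5], [6, 7]])   # second channel

# flatten channel by channel, as in the example above
flat = np.concatenate([c1.ravel(), c2.ravel()])
print(flat)   # [1 2 3 4 4 5 6 7]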
» Fully-Connected Layer
∗ An FC-layer can have multiple outputs, e.g. input x and two
outputs y1 = f(wᵀx), y2 = f(vᵀx). Here w is the weight vector for y1,
v the weight vector for y2.
[Figure: inputs x1, x2, . . . , xn feeding into two units f, producing outputs y1 and y2]
∗ If the input vector x has n elements and there are m outputs then the
FC-layer has n × m parameters.
∗ Suppose have h0 × w0 × c0 input and h1 × w1 × c1 output.
∗ Convolution layer has c1 × k × k × c0 parameters for k × k kernel
∗ FC-layer has h0 × w0 × c0 × h1 × w1 × c1 parameters
∗ h0 = w0 = 32, c0 = 32, h1 = w1 = 32, c1 = 32: the conv 3 × 3 layer
has 9216 parameters, the FC layer has ≈ 10⁹ parameters (see the sketch below).
∗ Common to use FC-layer as the last layer in a ConvNet i.e. the
layer which generates the (smallish number of) final outputs.
∗ How to choose nonlinear function f(·)?
∗ Common choice: softmax.
∗ Recall softmax = multi-class logistic regression model.
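∗ A quick sketch checking the parameter counts in the comparison above:
# parameter counts for h0 = w0 = 32, c0 = 32, h1 = w1 = 32, c1 = 32 and a 3 x 3 kernel
k, c0, c1 = 3, 32, 32
h0, w0, h1, w1 = 32, 32, 32, 32
conv_params = c1 * k * k * c0              # 9216
fc_params = h0 * w0 * c0 * h1 * w1 * c1    # 1073741824, i.e. about 10^9
print(conv_params, fc_params)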
» Convolutional Network Example
MNIST Dataset¹
∗ Training data: 60K images of handwritten digits 0-9. Test
data 10K images
∗ Each image is 28 × 28 pixels, gray scale
∗ Task is to predict which digit an image shows.
∗ Widely studied, relatively easy task. Best performance to
date is 99.8% accuracy using ConvNet
1 https://en.wikipedia.org/wiki/MNIST_database#cite_note-Gradient-9
» Convolutional Network Example
input 28 × 28 × 1 → [conv 3 × 3, 32, stride 2] → 13 × 13 × 32 → [conv 3 × 3, 64, stride 2] → 6 × 6 × 64 → [softmax]
∗ Uses strides to downsample the image.
∗ Input 28 × 28 × 1 → 13 × 13 × 32 → 6 × 6 × 64
∗ Number of channels increases as we move through network (1 → 32 → 64),
size of image decreases (28 × 28 → 13 × 13 → 6 × 6)
∗ We use final softmax layer/logistic regression to map from ConvNet features
to final output (flatten step not shown in schematic)
∗ Output is 10 × 1 → there are 10 classes, corresponding to digits 0-9,
elements of output vector are probability of each class. To make
prediction pick the class with highest probability.
» Convolutional Network Example
∗ We’ll use the Python keras package for ConvNets (it’s a front end to tensorflow)
import numpy as np
from tensorflow import keras
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D, Dropout
num_classes = 10
input_shape = (28, 28, 1)
# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
# One-hot encode the integer labels (needed for the categorical_crossentropy loss)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = keras.Sequential()
# 3x3 kernel with stride 2, 32 output channels.
model.add(Conv2D(32, kernel_size=(3, 3), strides=(2,2), input_shape=input_shape, activation="relu"))
# 3x3 kernel with stride 2, 64 output channels.
model.add(Conv2D(64, kernel_size=(3, 3), strides=(2,2), activation="relu"))
# Use CNN output as input to a logistic regression classifier. Regularise with an L2 penalty.
model.add(Flatten())
model.add(Dense(num_classes, activation="softmax", activity_regularizer=regularizers.l2(0.01)))
model.summary()
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=32, epochs=5, validation_split=0.2)
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss: %f accuracy: %f" % (score[0], score[1]))
∗ Note: use regularisation on FC-layers but usually not on convolutional layers.
Why?
» Convolutional Network Example
∗ Typical output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 13, 13, 32)        320
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 6, 6, 64)          18496
_________________________________________________________________
flatten (Flatten)            (None, 2304)              0
_________________________________________________________________
dense (Dense)                (None, 10)                23050
=================================================================
Total params: 41,866
Trainable params: 41,866
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
3000/3000 [==============================] - 6s 2ms/step - loss: 0.1927 - accuracy: 0.9447 - val_loss: 0.0916 - val_accuracy: 0.9765
Epoch 2/5
3000/3000 [==============================] - 6s 2ms/step - loss: 0.0788 - accuracy: 0.9788 - val_loss: 0.0755 - val_accuracy: 0.9814
Epoch 3/5
3000/3000 [==============================] - 6s 2ms/step - loss: 0.0584 - accuracy: 0.9850 - val_loss: 0.0700 - val_accuracy: 0.9820
Epoch 4/5
3000/3000 [==============================] - 6s 2ms/step - loss: 0.0466 - accuracy: 0.9882 - val_loss: 0.0723 - val_accuracy: 0.9819
Epoch 5/5
3000/3000 [==============================] - 6s 2ms/step - loss: 0.0384 - accuracy: 0.9908 - val_loss: 0.0616 - val_accuracy: 0.9858
Test loss: 0.051263 accuracy: 0.987100
∗ Achieves 98.7% accuracy on test data, model takes about 30s to train
∗ Baseline for comparison:
∗ Logistic regression: 73s to train, achieves 92% accuracy
∗ Kernelised SVM: 711s to train, achieves 94% accuracy
» Convolutional Network Example
∗ Can also use dropout rather than an L2 penalty for regularisation → using
dropout is popular in ConvNets
model = keras.Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), strides=(2,2), input_shape=input_shape, activation="relu"))
model.add(Conv2D(64, kernel_size=(3, 3), strides=(2,2), activation="relu"))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(num_classes, activation="softmax"))
∗ Again, note that we use regularisation on the FC-layer but usually not on
convolutional layers.
» Convolutional Network Example
An alternative (but very similar) architecture:
input 28 × 28 × 1 → [conv 3 × 3, 32] → 28 × 28 × 32 → [max-pool (2, 2)] → 14 × 14 × 32 → [conv 3 × 3, 64] → 14 × 14 × 64 → [max-pool (2, 2)] → 7 × 7 × 64 → [softmax]
∗ Use “same” padding in conv layers → output is same size as input.
∗ Use max-pool to downsample, stride=kernel size=2
∗ 28 × 28 × 1 → 28 × 28 × 32 → 14 × 14 × 32 → 14 × 14 × 64 → 7 × 7 × 64
∗ Using same padding plus max-pool like this is currently popular ... but that
might well change
∗ Python keras code:
model = keras.Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), input_shape=input_shape, padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), padding="same", activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(num_classes, activation="softmax", activity_regularizer=regularizers.l2(0.01)))
model.summary()
» Convolutional Network Example
∗ Typical output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 28, 28, 32)        320
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 32)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 14, 14, 64)        18496
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 64)          0
_________________________________________________________________
flatten (Flatten)            (None, 3136)              0
_________________________________________________________________
dense (Dense)                (None, 10)                31370
=================================================================
Total params: 50,186
Trainable params: 50,186
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
3000/3000 [==============================] - 21s 7ms/step - loss: 0.1490 - accuracy: 0.9565 - val_loss: 0.0627 - val_accuracy: 0.9854
Epoch 2/5
3000/3000 [==============================] - 22s 7ms/step - loss: 0.0570 - accuracy: 0.9850 - val_loss: 0.0527 - val_accuracy: 0.9886
Epoch 3/5
3000/3000 [==============================] - 22s 7ms/step - loss: 0.0432 - accuracy: 0.9898 - val_loss: 0.0567 - val_accuracy: 0.9849
Epoch 4/5
3000/3000 [==============================] - 22s 7ms/step - loss: 0.0345 - accuracy: 0.9920 - val_loss: 0.0504 - val_accuracy: 0.9877
Epoch 5/5
3000/3000 [==============================] - 21s 7ms/step - loss: 0.0284 - accuracy: 0.9941 - val_loss: 0.0474 - val_accuracy: 0.9901
Test loss: 0.044836 accuracy: 0.989100
∗ Achieves 98.9% accuracy on test data
∗ Takes 100s to train (longer than when using strides to downsample, why?)
» Cross-validation
∗ Training by minimising cost function and using
cross-validation to select hyperparameters (not just
regularisation penalty but also number of convolutional
output channels etc) is best practice
∗ But ...
∗ ... it often takes ages to train ConvNets. Even in the very easy
example above it takes a minute or so; with bigger networks and
more data, training can easily take days even with a good GPU
rig
∗ So k-fold cross-validation usually impractical, just takes too
long
∗ Instead we often just keep a hold-out test set and use that to
evaluate hyperparameter choices. Also we often evaluate
only a few hyperparameter values, as otherwise it takes too long.
∗ It's not great, but we have little choice. It also means you can
see many conflicting/random views on the web for how to
approach the same ML task.