Learning from Big Data
Lecture 19: Image recognition and CNNs
Dr. Lloyd T. Elliott, Fall 2022
Why image recognition is difficult
'Typographic attack': pen and paper fool AI into thinking apple is an iPod
• Recognizing objects in real scenes
• Variation in lighting and viewpoint
• Definition of objects
• Requires huge amounts of knowledge (even for segmentation and viewpoint / lighting)
State of the art
• ADOP: Approximate differentiable one-pixel point rendering (University of
Erlangen-Nuremberg)
• https://www.youtube.com/watch?v=WJRyu1JUtVw
Things that make it hard
• Segmentation: real scenes are cluttered with other objects:
• Hard to tell which pieces go together
• Parts of an object can be hidden or clipped (occlusion)
• Lighting: Intensities are as much determined by lighting as by nature of object
• Deformation: Wide variety of shapes have the same name
• Affordances: For many objects, function is more important than shape for definition
More things that make it hard to recognize objects
• Viewpoint: wide variety of viewpoints for the same object
• "Information hops between input dimensions" dimension hopping
• We don't see this for many types of structured data (medical for example)
Viewpoint invariance
• Each time we look at an object, we have a different viewpoint, unlike in other machine learning tasks
• Humans are so good at viewpoint variation, it's hard to appreciate how difficult it is
• One of the main difficulties in computer vision
• Typical approaches:
• Use redundant invariant features
• Bounding boxes
• Replicated features with pooling ("convolutional neurons")
Invariant feature approach
• Extract a large, overlapping / redundant set of features invariant to transformations (rotation, scaling, translation, shear, stretch)
• Example: centre / surround for the visual field
• Problem: features will overlap with objects that are not in the foreground ("parts of different objects")
• Put a box around objects
• Normalize within the box
• Choosing the box is difficult (chicken / egg problem)
Brute force normalization
• When training the recognizer, use well-segmented upright images to fit the correct box
• At test time, try all possible boxes over a range of positions and scales (see the sketch below)
• This approach was often used in computer vision around 2015
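A minimal sketch of this brute-force search (my own, not from the slides): the recognizer score_box, the box sizes, and the stride are hypothetical placeholders.

import numpy as np

def brute_force_detect(image, score_box, box_sizes=(32, 48, 64), stride=8):
    # score_box is assumed to be a recognizer trained on well-segmented,
    # upright crops; it takes a square patch and returns a confidence score
    best_score, best_box = -np.inf, None
    height, width = image.shape[:2]
    for size in box_sizes:                             # range of scales
        for y in range(0, height - size + 1, stride):  # range of vertical positions
            for x in range(0, width - size + 1, stride):
                patch = image[y:y + size, x:x + size]
                score = score_box(patch)               # normalize / classify within the box
                if score > best_score:
                    best_score, best_box = score, (x, y, size)
    return best_box, best_score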
Convolutional neural nets
• LeNet 1990s
• Use many different copies of the same feature detector at different positions
• A feature detector useful in one place in the
image is likely useful in other areas too
• When we learn, we keep the red arrows all
having the same weights as each other
(Figure: red connections all have the same weight)
Convolutional neural nets
• Replication greatly reduces the number of free parameters to be learned
• In this example, 27 -> 9 weights (see the Keras sketch below)
• Make many maps, each one with replicates of the same feature. Different maps learn to detect different features.
• Each patch of the image can then be represented by features of many different types
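To see the parameter saving concretely, here is a small Keras sketch (my own, not from the slides; the layer sizes are chosen for illustration and differ from the 27 -> 9 example above). One 3x3 detector replicated over a 5x5 image has 9 shared weights, while a fully connected layer producing the same nine outputs has 225 free parameters.

from tensorflow import keras
from tensorflow.keras import layers

# One 3x3 feature detector replicated across a 5x5 image: 9 shared weights,
# no matter how many positions it is copied to
shared = keras.Sequential([
    keras.Input(shape=(5, 5, 1)),
    layers.Conv2D(filters=1, kernel_size=(3, 3), use_bias=False),
])

# The same nine outputs from a fully connected layer: every output unit gets
# its own weight to every pixel, so 25 x 9 = 225 free parameters
unshared = keras.Sequential([
    keras.Input(shape=(5, 5, 1)),
    layers.Flatten(),
    layers.Dense(units=9, use_bias=False),
])

print(shared.count_params())    # 9
print(unshared.count_params())  # 225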
Backpropagation with weight constraints
• It's easy to modify the backpropagation algorithm to incorporate linear
constraints between the weights
• We compute gradients as usual, but we modify gradients so that they satisfy
the constraints
• To constrain w1 = w2, we need Δw1 = Δw2
• This is done as follows:
• Compute ∂E/∂w1 and ∂E/∂w2
• Use ∂E/∂w1 + ∂E/∂w2 as the update for both w1 and w2
• We can thus force backpropagation to use replicated features
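A minimal sketch (not from the slides) of what this update looks like in code, assuming the two gradients g1 and g2 have already been computed by ordinary backpropagation and an illustrative learning rate:

# g1 = dE/dw1 and g2 = dE/dw2, computed as usual by backpropagation
def tied_update(w1, w2, g1, g2, lr=0.1):
    g = g1 + g2                        # combined gradient satisfying the constraint
    return w1 - lr * g, w2 - lr * g    # w1 and w2 stay equal if they start equal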
What does replicating the features achieve?
• Equivariant activities: the neural activities in the next layer are not invariant to
translation, but they are equivariant
• The representation changes by as much as the image does
• Invariant knowledge: if a feature can be detected in one location, it can be
detected in other locations too
Pooling the output of replicated feature detectors
• To get invariance in activity, we must pool the output of the convolutional
layer
• Average (or take the maximum of) neighbouring replicated detectors to give a single output to the next level (illustrated in the sketch below)
• Reduces the number of inputs to the next layer (meaning we can learn more features)
• Problem: after several levels of this pooling, we lose information about the precise location of the object (that's fine for kilns, for example)
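The equivariance-then-invariance behaviour described above can be seen in a few lines of numpy; this is a small sketch of my own (not from the slides), using a 1-D signal and a two-tap edge detector:

import numpy as np

def correlate_valid(x, k):
    # 'valid' cross-correlation: the replicated feature detector applied at every position
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(len(x) - len(k) + 1)])

kernel  = np.array([1.0, -1.0])                  # a simple edge detector
x       = np.array([0., 0., 1., 1., 0., 0.])
x_shift = np.array([0., 0., 0., 1., 1., 0.])     # same pattern, shifted by one pixel

feat, feat_shift = correlate_valid(x, kernel), correlate_valid(x_shift, kernel)
print(feat)                          # [ 0. -1.  0.  1.  0.] -- equivariant: the response
print(feat_shift)                    # [ 0.  0. -1.  0.  1.]    pattern shifts with the input
print(feat.max(), feat_shift.max())  # 1.0 1.0 -- invariant after max pooling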
LeNet5
• Yann LeCun and collaborators developed the first good recognizer for handwritten digits using backpropagation in a feedforward net
• Many hidden layers, many maps of replicated units, pooling between layers.
Did not require segmentation
• Was deployed by the USPS, handling ~10% of zip code reading in the USA in the early 2000s
LeNet5 in tensorflow
• medium.com/@mgazar
from tensorflow import keras
from tensorflow.keras import layers

# LeNet5-style architecture: two convolution + average-pooling stages,
# followed by fully connected layers and a 10-way softmax output
model = keras.Sequential()
model.add(layers.Conv2D(filters=6, kernel_size=(3, 3), activation='relu', input_shape=(32, 32, 1)))
model.add(layers.AveragePooling2D())
model.add(layers.Conv2D(filters=16, kernel_size=(3, 3), activation='relu'))
model.add(layers.AveragePooling2D())
model.add(layers.Flatten())
model.add(layers.Dense(units=120, activation='relu'))
model.add(layers.Dense(units=84, activation='relu'))
model.add(layers.Dense(units=10, activation='softmax'))
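As a usage sketch (my own, not from the Medium post): the model above can be trained on MNIST after padding the 28x28 digits to the 32x32 input the first layer expects. The optimizer and epoch count are illustrative choices.

import numpy as np

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = np.pad(x_train, ((0, 0), (2, 2), (2, 2)))[..., np.newaxis] / 255.0
x_test  = np.pad(x_test,  ((0, 0), (2, 2), (2, 2)))[..., np.newaxis] / 255.0

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))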
Prior knowledge in machine learning
• LeNet5 prior knowledge:
• Connectivity
• Weight constraints
• Activation functions
• Less intrusive than hand engineering features, but still pushes the network
towards a particular way of solving the problem
• Alternative: use prior knowledge to create more training data, e.g. augment the training data with simulated data (Hofman 1993)
More tricks
• Data augmentation
• Subsample & transform training images (AugMix Hendrycks et al. 2019)
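A minimal sketch of simple image augmentation with Keras preprocessing layers (my own; this illustrates random transforms of training images in general, not the AugMix procedure, and the transform parameters are illustrative):

from tensorflow import keras
from tensorflow.keras import layers

# Random transforms applied to each training image on the fly
augment = keras.Sequential([
    layers.RandomTranslation(height_factor=0.1, width_factor=0.1),
    layers.RandomRotation(factor=0.05),
    layers.RandomZoom(height_factor=0.1),
])

# Example: augment a batch of MNIST digits (x_train from the LeNet5 sketch above)
augmented_batch = augment(x_train[:32], training=True)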
Thank you