12.2 Computer Vision

The document discusses applications of deep learning in computer vision. It describes how computer vision is well suited to deep learning research because vision is easy for humans but difficult for computers. Common computer vision tasks aimed at replicating human abilities include object recognition, detection, and image synthesis. Preprocessing techniques for computer vision with deep learning include contrast normalization and dataset augmentation, which increases the training data through transformations.

Deep Learning Srihari

Applications: Computer Vision

Sargur N. Srihari
srihari@cedar.buffalo.edu


Topics in Applications
1. Large-Scale Deep Learning
2. Computer Vision
3. Speech Recognition
4. Natural Language Processing
5. Other Applications


Topics in Computer Vision


• Overview
• Preprocessing
– Contrast Normalization
– Dataset Augmentation


Computer Vision and Deep Learning


• Computer vision is one of the most active areas of deep learning research, since
– Vision is effortless for humans but difficult for computers
• Many standard benchmarks for deep learning algorithms are forms of:
– Object recognition
– Optical character recognition (OCR)


Common tasks
• Most deep learning work in vision targets a small core of AI goals aimed at replicating human abilities
– Object recognition or detection of some form
• Reporting which object is present in an image
• Annotating an image with bounding boxes around each object
• Transcribing a sequence of symbols from an image
• Labeling each pixel with the identity of the object it belongs to
– Image synthesis
• Because generative modeling is a guiding principle of deep learning, there is a large body of work on synthesis

Preprocessing

• Many application areas need sophisticated preprocessing
• Computer vision usually requires relatively little
– Pixel range
• Images should be standardized so that all pixels lie in the same range, e.g. [0,1], [-1,1], or [0,255]
– Picture size
• Some architectures need a standard input size, so images may need to be scaled or cropped
• May not be needed with convolutional models that dynamically adjust the size of their pooling regions
– Dataset augmentation
• Can be seen as a preprocessing step for the training set only
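The pixel-range standardization described above can be sketched in a few lines. This is a minimal, dependency-free illustration, not a method from the slides: the function name `rescale_pixels` and the flattened-list image representation are assumptions made for the example, and a real pipeline would normally use an array library.

```python
def rescale_pixels(pixels, old_range=(0, 255), new_range=(0.0, 1.0)):
    """Linearly map pixel values from old_range to new_range.

    `pixels` is a flat list of intensities (hypothetical representation).
    """
    old_lo, old_hi = old_range
    new_lo, new_hi = new_range
    if old_hi == old_lo:
        raise ValueError("old_range must have nonzero width")
    width = new_hi - new_lo
    return [new_lo + (p - old_lo) / (old_hi - old_lo) * width for p in pixels]


# A tiny grayscale "image" flattened to a list of 8-bit intensities.
image = [0, 51, 102, 255]
unit = rescale_pixels(image)                             # values in [0, 1]
signed = rescale_pixels(image, new_range=(-1.0, 1.0))    # values in [-1, 1]
```

Whichever target range is chosen, the point of the slide is that it should be the same for every image fed to the model.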

Training with large data sets


• With large data sets (ImageNet) and large models (AlexNet)
– Most preprocessing is unnecessary
– The model learns which kinds of variability to become invariant to

• AlexNet on ImageNet uses only one preprocessing step
– Subtract the per-pixel mean computed across the training examples
– Dataset: ILSVRC subset of ImageNet, roughly 1000 images in each of 1000 categories: 1.2M training, 50k validation, 150k testing
– Architecture: CNN with 5 convolutional layers, max-pooling layers, dropout layers, and 3 fully connected layers
– Performance: top-5 error rate of 15.4%; the next-best entry scored 26.2%
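The single mean-subtraction step can be illustrated as follows. This is a sketch under stated assumptions, not the original AlexNet code: images are represented as equal-length flat lists, and the helper names `per_pixel_mean` and `subtract_mean` are hypothetical.

```python
def per_pixel_mean(train_images):
    """Mean of each pixel position across all training images."""
    n = len(train_images)
    # zip(*...) groups the values found at the same pixel position.
    return [sum(column) / n for column in zip(*train_images)]


def subtract_mean(image, mean_image):
    """Center one image using a previously computed mean image."""
    return [p - m for p, m in zip(image, mean_image)]


# Three toy training images with three pixels each.
train = [
    [0.0, 0.5, 1.0],
    [0.2, 0.5, 0.6],
    [0.4, 0.5, 0.2],
]
mean_image = per_pixel_mean(train)
centered = [subtract_mean(img, mean_image) for img in train]
```

The mean image is estimated from the training set only and then reused unchanged for validation and test images.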

Contrast Normalization
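• Contrast is a source of variation that can be safely removed for many tasks
• Contrast refers to the magnitude of the difference between bright and dark pixels
• In deep learning, a more specific definition is used
– Contrast = standard deviation of the pixels in an image or image region
– For an RGB image X with r rows and c columns, where X_{i,j,k} is the intensity of channel k at row i and column j, the contrast of the entire image is
$$\sqrt{\frac{1}{3rc} \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{3} \left(X_{i,j,k} - \bar{X}\right)^2}$$
where \bar{X} is the mean intensity of the entire image
$$\bar{X} = \frac{1}{3rc} \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{3} X_{i,j,k}$$
• When the standard deviation is high, values differ more from the mean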
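The definition of contrast as the standard deviation of all pixel values can be checked on toy inputs. The sketch below assumes an image stored as nested lists of (R, G, B) tuples; the function name `contrast` and the sample images are illustrative only.

```python
import math


def contrast(image):
    """Standard deviation over every value of an r x c x 3 image.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    values = [v for row in image for pixel in row for v in pixel]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


# A uniformly gray image has no contrast at all.
flat_gray = [[(0.5, 0.5, 0.5), (0.5, 0.5, 0.5)]]
# One black pixel next to one white pixel has high contrast.
black_white = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]

low = contrast(flat_gray)
high = contrast(black_white)
```

Here `low` is zero because every value equals the mean, while `high` is 0.5 because every value lies 0.5 away from the mean.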

Global Contrast Normalization


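• Aims to prevent images from having varying amounts of contrast
• Subtract the mean from each image, then rescale it so that the standard deviation across its pixels equals a constant s
• Given an input image X with r rows, c columns, and 3 color channels, GCN produces an output image X′ defined by
$$X'_{i,j,k} = s \, \frac{X_{i,j,k} - \bar{X}}{\max\left\{ \varepsilon, \sqrt{\lambda + \frac{1}{3rc} \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{3} \left(X_{i,j,k} - \bar{X}\right)^2} \right\}}$$
where \bar{X} is the mean of all pixel values of X
• 𝜆 is a positive regularization parameter that biases the estimate of the standard deviation; alternatively, the denominator is constrained to be at least 𝜀
• Both guard against dividing by a near-zero standard deviation in low-contrast images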
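Global contrast normalization as described above can be sketched directly from its definition. This is a minimal illustration, assuming an image stored as nested lists of (R, G, B) tuples; the function name and the default values for `s`, `lam`, and `eps` are assumptions chosen for the example, not values prescribed by the slides.

```python
import math


def global_contrast_normalize(image, s=1.0, lam=0.0, eps=1e-8):
    """Subtract the image mean, then rescale so the pixel std is about s.

    lam biases the variance estimate; eps is a floor on the denominator.
    """
    values = [v for row in image for pixel in row for v in pixel]
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    denom = max(eps, math.sqrt(lam + variance))
    return [
        [tuple(s * (v - mean) / denom for v in pixel) for pixel in row]
        for row in image
    ]


image = [[(0.2, 0.4, 0.6), (0.8, 1.0, 0.0)]]
normalized = global_contrast_normalize(image, s=2.0)

# A constant image has zero std; the eps floor avoids division by zero.
constant = global_contrast_normalize([[(0.3, 0.3, 0.3)]])
```

With `lam=0` the output has mean zero and standard deviation `s`; a constant image simply maps to all zeros instead of raising an error.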

GCN maps examples onto sphere


• Raw input data may have any norm
• GCN with 𝜆=0 maps all nonzero examples exactly onto a sphere
• Regularized GCN with 𝜆>0 draws examples toward the sphere but does not completely discard the variation in their norm
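A short derivation shows why the unregularized case lands on a sphere. It assumes 𝜆=0 and a pixel standard deviation σ larger than the floor 𝜀, for an image X with r rows, c columns, 3 channels, and mean pixel value \bar{X}; the symbol σ is introduced here only for the derivation.

```latex
\sigma = \sqrt{\frac{1}{3rc} \sum_{i,j,k} \left(X_{i,j,k} - \bar{X}\right)^2},
\qquad
X'_{i,j,k} = s \, \frac{X_{i,j,k} - \bar{X}}{\sigma}

\|X'\|_2^2 = \sum_{i,j,k} \frac{s^2 \left(X_{i,j,k} - \bar{X}\right)^2}{\sigma^2}
           = \frac{s^2 \cdot 3rc \, \sigma^2}{\sigma^2}
           = 3rc \, s^2
\quad\Longrightarrow\quad
\|X'\|_2 = s \sqrt{3rc}
```

Every normalized image therefore has the same norm, so all examples lie on a sphere of radius s√(3rc), which is not the unit sphere unless s is chosen accordingly. With 𝜆>0 the denominator exceeds σ, so the norm falls below this radius by an amount that depends on the original contrast.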


Local Contrast Normalization


• Contrast is normalized across each small
window rather than entire image
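One simple way to normalize contrast per window is sketched below. This is only one of several possible variants and is an assumption of this example, not the scheme on the slide: it uses a square window clipped at the image border on a 2D grayscale image, and the names `local_contrast_normalize`, `radius`, and `eps` are illustrative.

```python
import math


def local_contrast_normalize(image, radius=1, eps=1e-8):
    """Normalize each pixel by the mean and std of its neighborhood.

    `image` is a 2D list of grayscale values. The window is a
    (2*radius+1) x (2*radius+1) square, clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                image[a][b]
                for a in range(max(0, i - radius), min(rows, i + radius + 1))
                for b in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            # eps keeps flat regions from causing a division by zero.
            out[i][j] = (image[i][j] - mean) / max(eps, math.sqrt(var))
    return out


spot = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
normalized_spot = local_contrast_normalize(spot)
```

Because each pixel is divided by the standard deviation of its own neighborhood, multiplying the whole image by a constant leaves the output unchanged wherever the local standard deviation exceeds `eps`.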


Dataset Augmentation
• Increase the size of the training set by adding modified copies of training examples
– Use transformations that do not change the class
• Object recognition is especially amenable because the class is invariant to many transformations and the input is easily transformed with geometric operations
– Classifiers benefit from random translations, rotations, and flips of the input
• In specialized vision applications, more advanced transformations are used:
– Random perturbations of colors
– Nonlinear geometric distortions of the input
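Random flips and translations can be sketched as label-preserving transforms. The example below is a minimal illustration on 2D lists; the helper names (`horizontal_flip`, `translate`, `augment`), the zero fill value, and the 50% flip probability are assumptions made for the example.

```python
import random


def horizontal_flip(image):
    """Mirror a 2D image left to right."""
    return [row[::-1] for row in image]


def translate(image, dx, dy, fill=0):
    """Shift a 2D image dx pixels right and dy pixels down, padding with fill."""
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            si, sj = i - dy, j - dx  # source pixel for output position (i, j)
            if 0 <= si < rows and 0 <= sj < cols:
                out[i][j] = image[si][sj]
    return out


def augment(image, label, rng, max_shift=1):
    """Return a randomly flipped and shifted copy; the label is unchanged."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return translate(image, dx, dy), label


rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
augmented = [augment(image, "cat", rng) for _ in range(4)]
```

Each transform must preserve the class for the task at hand: a horizontal flip is harmless for most object categories but not for character recognition, where it can turn a "b" into a "d".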