
EECS 498-007 / 598-005

Deep Learning for Computer Vision


Lecture 1: Introduction

Justin Johnson Lecture 1 - 1 September 4, 2019


Deep Learning for Computer Vision

Building artificial systems that process, perceive, and reason about visual data


Computer Vision is everywhere!
Left to right:
Image by Roger H Goun is licensed under CC BY 2.0
Image is CC0 1.0 public domain
Image is CC0 1.0 public domain
Image is CC0 1.0 public domain

Left to right:
Image is free to use
Image is CC0 1.0 public domain
Image by NASA is licensed under CC BY 2.0
Image is CC0 1.0 public domain

Bottom row, left to right:
Image is CC0 1.0 public domain
Image by Derek Keats is licensed under CC BY 2.0; changes made
Image is public domain
Image is licensed under CC-BY 2.0; changes made


Deep Learning for Computer Vision

Building artificial systems that learn from data and experience


Deep Learning for Computer Vision

Hierarchical learning algorithms with many “layers”, (very) loosely inspired by the brain


Artificial Intelligence

Machine Learning

Computer Vision, Deep Learning

This class: Deep Learning for Computer Vision

Related areas shown on the diagram: Natural Language Processing, Speech Recognition, Robotics


Today’s Agenda

• A brief history of computer vision and deep learning

• Course overview and logistics



Hubel and Wiesel, 1959
Measure brain activity

Simple cells: Response to light orientation

Complex cells: Response to light orientation and movement

Hypercomplex cells: Response to movement with an end point

Figure labels: Stimulus, Response, No response

Cat image by CNX OpenStax is licensed under CC BY 4.0; changes made


Larry Roberts, 1963

(a) Original picture (b) Differentiated picture (c) Feature points selected



Stages of Visual Representation, David Marr, 1970s

Stages: Input Image, Primal Sketch, 2 ½-D Sketch, 3-D Model Representation

• Input Image: Perceived intensities
• Primal Sketch: Zero crossings, blobs, edges, bars, ends, virtual lines, groups, curves, boundaries
• 2 ½-D Sketch: Local surface orientation and discontinuities in depth and in surface orientation
• 3-D Model Representation: 3-D models hierarchically organized in terms of surface and volumetric primitives

Figure panels: Input image, Edge image, 2 ½-D sketch, 3-D model

Image credits: This image is CC0 1.0 public domain (two images)


Recognition via Parts (1970s)

Generalized Cylinders, Brooks and Binford, 1979

Pictorial Structures, Fischler and Elschlager, 1973


Recognition via Edge Detection (1980s)

John Canny, 1986

David Lowe, 1987

Image is CC0 1.0 public domain


Recognition via Grouping (1990s)

Normalized Cuts, Shi and Malik, 1997

Left image is CC BY 3.0; middle image is public domain; right image is CC-BY 2.0; changes made


Recognition via Matching (2000s)

SIFT, David Lowe, 1999

Images are public domain


Face Detection

Viola and Jones, 2001

One of the first successful applications of machine learning to vision


PASCAL Visual Object Challenge
Example object labels: Train, Person, Airplane

Images are CC0 1.0 public domain


The Image Classification Challenge:
1,000 object classes
1,431,167 images

Output: Scale, T-shirt, Steel drum, Drumstick, Mud turtle

Deng et al, 2009
Russakovsky et al. IJCV 2015


Enter Deep Learning



AlexNet: Deep Learning Goes Mainstream

Krizhevsky, Sutskever, and Hinton, NeurIPS 2012



Perceptron
One of the earliest algorithms that could learn from data

Implemented in hardware! Weights stored in potentiometers, updated with electric motors during learning

Connected to a camera that used 20x20 cadmium sulfide photocells to make a 400-pixel image

Could learn to recognize letters of the alphabet

Today we would recognize it as a linear classifier

Frank Rosenblatt, ~1957
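To make the "linear classifier" remark concrete, here is a minimal sketch of the perceptron learning rule in plain Python. It illustrates the idea only and is not a model of Rosenblatt's hardware: the names `predict` and `train_perceptron`, the threshold at zero, the unit learning rate, the epoch count, and the toy AND dataset are all assumptions made for this example.

```python
def predict(weights, bias, x):
    """Linear classifier: output 1 if w . x + b > 0, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, epochs=20):
    """Perceptron rule: nudge the weights toward each misclassified example.

    `samples` is a list of (inputs, target) pairs with targets 0 or 1.
    The epoch count and unit learning rate are assumptions for this sketch.
    """
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0, or +1
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
    return weights, bias


# Toy, linearly separable data (logical AND); not taken from the lecture.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
predictions = [predict(weights, bias, x) for x, _ in AND_SAMPLES]
print(predictions)  # → [0, 0, 0, 1]
```

The learned parameters play the role of the potentiometer settings described above: each update changes a weight only when the current output is wrong.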


Minsky and Papert, 1969
X  Y  F(x,y)
0  0  0
0  1  1
1  0  1
1  1  0

Showed that Perceptrons could not learn the XOR function

Caused a lot of disillusionment in the field
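The XOR limitation can be illustrated with a small search. A single threshold unit outputs 1 when w1*x + w2*y + b > 0. XOR would require b <= 0, w1 + b > 0, w2 + b > 0, and w1 + w2 + b <= 0; adding the two middle inequalities gives w1 + w2 + 2b > 0, which together with the last one forces b > 0, a contradiction. The sketch below checks this over a grid of candidate weights. The grid range, the step size, and the function names are assumptions for illustration, and a grid search is a sanity check, not a proof.

```python
from itertools import product


def linear_classifier(w1, w2, b, x, y):
    """Single threshold unit: 1 if w1*x + w2*y + b > 0, else 0."""
    return 1 if w1 * x + w2 * y + b > 0 else 0


def count_solutions(truth_table, grid):
    """Count (w1, w2, b) triples on the grid that reproduce the whole truth table."""
    count = 0
    for w1, w2, b in product(grid, repeat=3):
        if all(linear_classifier(w1, w2, b, x, y) == target
               for (x, y), target in truth_table.items()):
            count += 1
    return count


XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

# Candidate weights from -5.0 to 5.0 in steps of 0.5 (an arbitrary choice).
grid = [k / 2 for k in range(-10, 11)]

xor_solutions = count_solutions(XOR, grid)
and_solutions = count_solutions(AND, grid)
print("XOR solutions:", xor_solutions)  # → XOR solutions: 0
print("AND has solutions:", and_solutions > 0)  # → AND has solutions: True
```

AND is included as a contrast: it is linearly separable, so many grid points solve it, while none solve XOR.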


Neocognitron: Fukushima, 1980

Computational model of the visual system, directly inspired by Hubel and Wiesel’s hierarchy of complex and simple cells

Interleaved simple cells (convolution) and complex cells (pooling)

No practical training algorithm

Looks a lot like AlexNet more than 32 years later!
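The interleaving of convolution and pooling can be sketched in a few lines of plain Python. This is a generic illustration under stated assumptions, not the Neocognitron itself: the functions `conv2d` and `max_pool2d`, the hand-picked edge kernel, and the toy image are hypothetical, and the pooling shown is max pooling with a 2x2 window.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation: slide the kernel over the image, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest response in each window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in cols
        ]
        for i in rows
    ]


# Toy 4x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(4)]
# Hand-picked 1x2 kernel that responds to a dark-to-bright step.
kernel = [[-1, 1]]

feature_map = conv2d(image, kernel)  # "simple cell" stage
pooled = max_pool2d(feature_map)     # "complex cell" stage
print(feature_map)  # → [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
print(pooled)       # → [[1, 0], [1, 0]]
```

Pooling keeps the edge response while discarding its exact position inside each window, which is the kind of tolerance to small shifts that the complex-cell stage is meant to provide.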


Backprop: Rumelhart, Hinton, and Williams, 1986

Introduced backpropagation for computing gradients in neural networks

Successfully trained perceptrons with multiple layers

Figure annotation: recognizable math

Illustration of Rumelhart et al., 1986 by Lane McIntosh, copyright CS231n 2017
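As a rough illustration of what "computing gradients" means here, the sketch below applies the chain rule by hand to a tiny two-layer network and compares one gradient entry against a finite-difference estimate. Everything specific is an assumption made for the example: the network shape (2 inputs, 2 tanh hidden units, 1 linear output), the squared-error loss, the parameter values, and the function names.

```python
import math


def forward(params, x, y):
    """h = tanh(W1 x + b1), out = W2 . h + b2, loss = 0.5 * (out - y)^2."""
    W1, b1, W2, b2 = params
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    h = [math.tanh(p) for p in pre]
    out = sum(w * hi for w, hi in zip(W2, h)) + b2
    loss = 0.5 * (out - y) ** 2
    return loss, (h, out)


def backward(params, x, y):
    """Backpropagation: apply the chain rule from the loss back to each parameter."""
    W1, b1, W2, b2 = params
    _, (h, out) = forward(params, x, y)
    d_out = out - y                                   # dL/d(out)
    dW2 = [d_out * hi for hi in h]                    # dL/dW2
    db2 = d_out                                       # dL/db2
    d_h = [d_out * w for w in W2]                     # dL/dh
    d_pre = [dh * (1.0 - hi * hi) for dh, hi in zip(d_h, h)]  # tanh' = 1 - tanh^2
    dW1 = [[dp * xi for xi in x] for dp in d_pre]     # dL/dW1
    db1 = d_pre                                       # dL/db1
    return dW1, db1, dW2, db2


def numeric_dW1(params, x, y, i, j, eps=1e-5):
    """Central finite difference for dL/dW1[i][j], used only as a check."""
    W1, b1, W2, b2 = params

    def loss_with(delta):
        W1_shifted = [row[:] for row in W1]
        W1_shifted[i][j] += delta
        return forward((W1_shifted, b1, W2, b2), x, y)[0]

    return (loss_with(eps) - loss_with(-eps)) / (2 * eps)


# Arbitrary illustrative parameters and a single training example.
params = ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1], [1.0, -1.5], 0.05)
x, y = [1.0, 2.0], 1.0

analytic = backward(params, x, y)[0][0][1]
numeric = numeric_dW1(params, x, y, 0, 1)
print(abs(analytic - numeric) < 1e-6)
```

Agreement between the two numbers is a common sanity check when writing backward passes by hand; the finite-difference version needs two forward passes per parameter, whereas backpropagation gets every gradient from one forward and one backward pass.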


Convolutional Networks: LeCun et al, 1998

Applied backprop algorithm to a Neocognitron-like architecture


Learned to recognize handwritten digits
Was deployed in a commercial system by NEC, processed handwritten checks
Very similar to our modern convolutional networks!



2000s: “Deep Learning”
People tried to train neural networks that were deeper and deeper

Not a mainstream research topic at this time

Hinton and Salakhutdinov, 2006
Bengio et al, 2007
Lee et al, 2009
Glorot and Bengio, 2010


2012 to Present: Deep Learning Explosion

Figure: Google Trends for “Deep Learning”
Figure: Publications at top Computer Vision conference


2012 to Present: ConvNets are everywhere
Image Classification Image Retrieval

Figures copyright Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012. Reproduced with permission.

Object Detection Image Segmentation

Ren, He, Girshick, and Sun, 2015; Farabet et al, 2012

Video Classification Activity Recognition

Simonyan et al, 2014

Pose Recognition (Toshev and Szegedy, 2014)

Playing Atari games (Guo et al, 2014)

Medical Imaging
Levy et al, 2016. Figure reproduced with permission.

Galaxy Classification
Dieleman et al, 2014
From left to right: public domain by NASA, usage permitted by ESA/Hubble, public domain by NASA, and public domain.

Whale recognition
Kaggle Challenge
This image by Christin Khan is in the public domain and originally came from the U.S. NOAA.

Image Captioning
Vinyals et al, 2015
Karpathy and Fei-Fei, 2015

Example captions:
• A white teddy bear sitting in the grass
• A man in a baseball uniform throwing a ball
• A woman is holding a cat in her hand
• A man riding a wave on top of a surfboard
• A cat sitting on a suitcase on the floor
• A woman standing on a beach holding a surfboard

Captions generated by Justin Johnson using Neuraltalk2

All images are CC0 Public domain:
https://pixabay.com/en/luggage-antique-cat-1643010/
https://pixabay.com/en/teddy-plush-bears-cute-teddy-bear-1623436/
https://pixabay.com/en/surf-wave-summer-sport-litoral-1668716/
https://pixabay.com/en/woman-female-model-portrait-adult-983967/
https://pixabay.com/en/handstand-lake-meditation-496008/
https://pixabay.com/en/baseball-player-shortstop-infield-1045263/


Mordvintsev et al, 2015
Figures copyright Justin Johnson, 2015. Reproduced with permission. Generated using the Inceptionism approach from a blog post by Google Research.

Gatys et al, 2016
Original image is CC0 public domain
Starry Night and Tree Roots by Van Gogh are in the public domain
Bokeh image is in the public domain
Stylized images copyright Justin Johnson, 2017; reproduced with permission


Algorithms

Data

Computation



[Figure: GigaFLOPs per Dollar over Time (1/2004 to 9/2017) for CPU and GPU. Labeled points: GeForce 8800 GTX, GeForce GTX 580 (AlexNet), GTX 1080 Ti. Annotated region: Deep Learning Explosion.]


2018 Turing Award

Yoshua Bengio, Geoffrey Hinton, Yann LeCun


Despite our success, computer vision still has a long way to go…


This image is copyright-free United States government work
Example credit: Andrej Karpathy


Computer Vision Technology Can Better Our Lives

Outside border images, clockwise, starting from top left:
Image by Pop Culture Geek is licensed under CC BY 2.0; changes made
Image by the US Government is in the public domain
Image by the US Government is in the public domain
Image by Glogger is licensed under CC BY-SA 3.0; changes made
Image by Sylenius is licensed under CC BY 3.0; changes made
Image by US Government is in the public domain

Inside four images, clockwise, starting from top left:
Image is CC0 1.0 public domain
Image by Tucania is licensed under CC BY-SA 3.0; changes made
Image by Intuitive Surgical, Inc. is licensed under CC BY-SA 3.0; changes made
Image by Oyundari Zorigtbaatar is licensed under CC BY-SA 4.0


Today’s Agenda

• A brief history of computer vision and deep learning

• Course overview and logistics

Timeline, top row:
• 1959: Hubel & Wiesel
• 1963: Roberts
• 1970s: David Marr
• 1979: Gen. Cylinders
• 1986: Canny
• 1997: Norm. Cuts
• 1999: SIFT
• 2001: V&J
• 2001: PASCAL
• 2009: ImageNet
• 2019: This class

Timeline, bottom row:
• 1958: Perceptron
• 1969: Minsky & Papert
• 1980: Neocognitron
• 1985: Backprop
• 1998: LeNet
• 2006: Deep Learning
• 2012: AlexNet
• 2018: Turing Award

Also marked on the timeline: AI Winter


Course Staff

Instructor
• Justin Johnson: Assistant Professor, CSE

Graduate Student Instructors
• Yunseok Jang: PhD student, CSE (Video understanding, Generative models)
• Kibok Lee: PhD student, CSE (Robustness, Generalization)
• Luowei Zhao: PhD student, RI (Vision & Language)


How to contact us
• Course Website: https://web.eecs.umich.edu/~justincj/teaching/eecs498/
  • Syllabus, schedule, assignments, slides, lecture videos, etc
• Piazza: https://piazza.com/class/k01uvwqmf8c4nb
  • (Almost) all questions about the course should go here!
  • We will also use Piazza to communicate with you
  • Use private questions if you want to post code
• Canvas:
  • For turning in homework assignments
• Google Calendar: For office hours (starting next week)
• Email: Only for sensitive, confidential issues


Optional Textbook

• Deep Learning by Goodfellow, Bengio, and Courville
  • Free online


Course Content and Grading

• 6 programming assignments (10% each)
  • Homework assignments will use Python, PyTorch, and Google Colab
• Midterm Exam (20%)
• Final Exam (20%)
• Late policy
  • 3 free late days to use on assignments
  • Once free late days are exhausted, 25% penalty per day
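The weights above add up to 100% (6 x 10% + 20% + 20%). As a rough worked example of the arithmetic, the sketch below computes a course grade under one possible reading of the late policy. The slide does not spell out the details, so the following are assumptions for illustration only: the free late days are shared across all assignments and used in submission order, and each further late day removes 25% of that assignment's raw score, floored at zero. The function names are hypothetical.

```python
# Weights and late-policy numbers as stated on the slide.
ASSIGNMENT_WEIGHT = 0.10   # each of 6 programming assignments
MIDTERM_WEIGHT = 0.20
FINAL_WEIGHT = 0.20
FREE_LATE_DAYS = 3
PENALTY_PER_DAY = 0.25


def apply_late_policy(scores, days_late):
    """Return assignment scores after late penalties.

    ASSUMPTION (not stated on the slide): free late days are pooled and
    consumed in order; each remaining late day removes 25% of the raw
    score, and a score never drops below zero.
    """
    free_left = FREE_LATE_DAYS
    adjusted = []
    for score, late in zip(scores, days_late):
        used = min(free_left, late)
        free_left -= used
        penalized_days = late - used
        factor = max(0.0, 1.0 - PENALTY_PER_DAY * penalized_days)
        adjusted.append(score * factor)
    return adjusted


def course_grade(assignment_scores, midterm, final):
    """Weighted total on a 0-100 scale."""
    total = sum(ASSIGNMENT_WEIGHT * s for s in assignment_scores)
    return total + MIDTERM_WEIGHT * midterm + FINAL_WEIGHT * final


raw_scores = [100, 100, 100, 100, 100, 100]
days_late = [0, 2, 0, 2, 0, 0]  # 4 late days total, 3 of them free
adjusted = apply_late_policy(raw_scores, days_late)
grade = course_grade(adjusted, midterm=100, final=100)
print(adjusted)
print(round(grade, 2))
```

With these made-up inputs, only the fourth assignment is penalized, for a single day beyond the free allowance.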


Collaboration Policy

• Rule 1: Don’t look at solutions or code that are not your own; everything you submit should be your own work
• Rule 2: Don’t share your solution code with others; however, discussing ideas or general strategies is fine and encouraged
• Rule 3: Indicate in your submissions anyone you worked with
• Turning in something late / incomplete is better than violating the honor code


Course Philosophy
• Thorough and Detailed
  • This is not “Learn PyTorch in 90 days”, nor “Deep Learning in 10 lines of code”
  • Understand how to write from scratch, debug, and train convolutional and other types of deep neural networks
  • We prefer to write from scratch, rather than rely on existing implementations
• Practical
  • Focus on practical techniques for training and debugging neural networks
  • Will use state-of-the-art software tools like PyTorch and TensorFlow
• State of the art
  • Most material we cover is research published in the last 5 years
• Will also cover some fun topics:
  • Image captioning (with RNNs)
  • DeepDream, Artistic Style Transfer


Course Structure
• First half: Fundamentals
  • Details of how to implement and train different types of networks
  • Fully-connected networks, convolutional networks, recurrent networks
  • How to train and debug, very detailed
• Second half: Applications and “Researchy” topics
  • Object detection, image segmentation, 3D vision, videos
  • Attention, Transformers
  • Vision and Language
  • Generative models: GANs, VAEs, etc
  • Less detailed: provide overview and references, but skip some details


First homework assignment

• Will be released over the weekend
• Due one week after release
• Monday’s lecture will be enough to complete it


Next time: Image Classification
