Computer Vision Course Overview

Computer Vision

CS-E4850, 5 study credits

Juho Kannala
Aalto University

Plan for today

• Background
  • What is computer vision?
  • Why study computer vision?

• Overview of the course

• Lecture 1: Image formation

Credits: Material for slides borrowed from Victor Prisacariu, Andrew Zisserman, Esa Rahtu, James Hays, Derek Hoiem, Svetlana Lazebnik, Steve Seitz, David Forsyth, and others

Course personnel

• Lecturer:
  Juho Kannala
  juho.kannala@aalto.fi

• Main course assistant:
  Xiaotian Li
  firstname.lastname@aalto.fi

A few words about me

Juho Kannala
Assistant Professor of Computer Vision
• PhD, University of Oulu, 2010
• Professor at Aalto since 2016
• Working with computer vision since 2000
• Recent projects and other info available on my homepage: https://users.aalto.fi/~kannalj1/

Motivation - what is computer vision?

Make computers understand images

• What kind of scene?
• Where are the cars?
• How far are the buildings?
• Where are the cars going?
• …

Many data modalities

• 2D or 3D still images
• Video frames
• X-ray
• Ultrasound
• Microscope
• …

What kind of information can be extracted?

Semantic information
Geometric information

What do we have here?

… seems pretty easy…

Wrong! Very hard big data problem…

• Hardware perspective:
  • RGB stereo images at 30 frames per second -> a data stream of hundreds of MB/s.
  • Non-trivial processing for every byte.
  • Massive image collections.

• Mathematical perspective:
  • Information is highly implicit or lost in perspective projection.
  • The 2D -> 3D mapping is ill-posed and ill-conditioned -> constraints are needed.
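
The order of magnitude quoted under the hardware perspective can be checked with a little arithmetic. The sketch below assumes uncompressed 8-bit RGB frames from two cameras; the two example resolutions and the helper name `stream_rate_mb_per_s` are illustrative assumptions, not values taken from the slides.

```python
def stream_rate_mb_per_s(width, height, fps=30, cameras=2,
                         channels=3, bytes_per_channel=1):
    """Raw data rate in MB/s (1 MB = 10**6 bytes) of an uncompressed stream."""
    bytes_per_frame = width * height * channels * bytes_per_channel
    return bytes_per_frame * cameras * fps / 1e6


# Assumed example resolutions (not from the slides).
vga_rate = stream_rate_mb_per_s(640, 480)
full_hd_rate = stream_rate_mb_per_s(1920, 1080)

print(f"VGA stereo:     {vga_rate:.1f} MB/s")
print(f"Full HD stereo: {full_hd_rate:.1f} MB/s")
```

Under these assumptions a VGA stereo pair produces roughly 55 MB/s and a Full HD pair roughly 373 MB/s, before any processing has been done.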

Wrong! Very hard big data problem… (continued)

• Artificial intelligence perspective:
  • Images have uneven information content.
  • Computational visual semantics is hard (what does visual stuff mean exactly?)
  • If we have limited time, what is the important visual stuff right now?

Still a massive challenge - if we want genuine autonomy.

Natural vision

• Humans see effortlessly, but… it is very hard work for our brains!
  • There are billions of neurons in the human brain.
  • Millions of years of evolution generated hardwired priors.

So why bother?
What are the advantages?

Why does computer vision matter?

• Engineering point of view - computer vision helps to solve many practical problems: business potential
• Scientific point of view - a human-like visual system is one of the grand challenges of artificial intelligence (AI)
  • AI itself is a grand challenge of computing

Why does computer vision matter?

• Safety
• Health
• Security
• Fun
• Access
• …

Computer vision is already here

• You are surrounded by devices using computer vision
• Imagine what can be done with already installed cameras!

Motivation - Success stories

Recognizing “simple” patterns

Face recognition

Object detection and recognition

Reconstruction: 3D from photo collections

The Visual Turing test for Scene Reconstruction, Shan, Adams, Curless, Furukawa, Seitz, in 3DV 2013. YouTube video.

A recent commercial 3D reconstruction system

YouTube

Robotics

• NASA’s Mars Rover: see “Computer Vision on Mars”
• Robocup: see www.robocup.org
• STAIRS at Stanford: Saxena et al. 2008

Self-driving cars (Nvidia @ CES 2016)

Visual odometry and SLAM

Augmented Reality (AR) and Virtual Reality (VR)

Image generation

A style-based generator architecture for generative adversarial networks. Karras, Laine, Aila. CVPR 2019.

Current state of affairs

• Many of the previous examples are less than 5 years old!
• Many new applications will appear in the next 5 years
• Strong open source culture
  • Many recent state-of-the-art methods are freely available
  • See papers from top conferences like CVPR, ECCV, ICCV, and NeurIPS

Rapidly growing area

[Figure: attendees and submissions to the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), up to 2019; the chart carries the label 5160.]

Rapidly growing area

Ref. Google Scholar top publications.

Rapidly growing area - substantial commercial interest

[Figure: CVPR 2018 sponsors]

Plenty of job opportunities

• Companies are looking for computer vision and deep learning experts.
• Big Internet players are investing heavily (Apple, Google, Facebook, Microsoft, Baidu, Tencent, …), as is the car industry (Tesla, BMW, …)
• Strong imaging ecosystem also in Finland

Specifics of this course

Course textbooks

• Szeliski: Computer Vision
  • Full copy freely available
• Hartley & Zisserman: Multiple View Geometry in Computer Vision
  • Available as an e-book via the library
• Forsyth & Ponce: Computer Vision
  • Full copy freely available

What will you learn on this course?

• Course content (numbers refer to chapters in Szeliski’s book, 1st edition):
  • Image formation and processing (2, 3)
  • Feature detection and matching (4)
  • Feature-based alignment and image stitching (6, 9)
  • Optical flow and tracking (8)
  • Basics of image classification and convolutional neural networks
  • Object recognition and detection (14)
  • Structure from motion, stereo and 3D reconstruction (7, 11, 12)

What will you NOT learn on this course?

• Software packages
  • PyTorch, TensorFlow, Keras, Caffe, etc.
  • We have simple exercises with Python/Matlab though
• In-depth deep learning
  • Tweaking architectures, loss functions, etc.
  • Note that there is a separate deep learning course (CS-E4890)
• All the bells and whistles in the state-of-the-art systems
  • We concentrate on the basic concepts (get them right and the rest is easier for you)

Organization

• Lectures on Mondays at 8-10 (12 lectures)
• Exercises on Fridays at 12-14 (12 sessions)
  • Solutions to the weekly homework assignments should be returned before the session
  • The solutions are presented in the session
• Guidance available if needed
  • Slack and guidance sessions on Thursdays (see MyCourses)
• Attendance is not rewarded; only returned homework and the exam count

Requirements

• Get more than 0 points from at least 8 exercise rounds (i.e. solve at least 1 task from 8 different weekly rounds)
• Pass the exam

Hints

• Doing homework takes time but is often a good way to learn in depth
• Try to do more than the minimum - homework points are taken into account in the grading (i.e. weighted exercise points are added to exam points)
• Note that the amount of work and bonus points varies a bit between weeks - exercises are published early so that you can do them in advance if needed

Questions at this point?

Lecture 1: Camera model

Relevant reading

• Chapters 2, 3, and 6 in [Hartley & Zisserman]
  • Comprehensive presentation of the core content
• Chapter 2 in [Szeliski]
  • Broader overview of image formation

This is (a picture of) a cat

Credits: Victor Prisacariu

Cat lives in a 3D world

The point X in world space projects to the point x in image space.

Credits: Victor Prisacariu

Going from X in 3D to x in 2D

The image would be blurry if the film were simply exposed to the cat.

Pinhole camera

All rays pass through a single point (the center of projection).

What happens in the projection?

• Projection from 3D to 2D -> information is lost
• What properties are preserved?
  • Straight lines
  • Incidence
• What properties are not preserved?
  • Angles
  • Lengths
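
A small numerical check can make the preserved and lost properties concrete. This is a minimal sketch assuming an ideal pinhole camera at the origin looking along the third axis with focal length f = 1; the helper names and the example points are made up for illustration.

```python
import math


def project(point, f=1.0):
    """Ideal pinhole projection of a 3D point (X1, X2, X3) with X3 > 0."""
    X1, X2, X3 = point
    return (f * X1 / X3, f * X2 / X3)


def collinear(a, b, c, tol=1e-12):
    """True if three 2D points lie on one line (zero signed triangle area)."""
    area2 = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return abs(area2) < tol


# Three collinear 3D points; B is the exact midpoint of A and C.
A = (0.0, 0.0, 2.0)
C = (2.0, 1.0, 6.0)
B = tuple((p + q) / 2 for p, q in zip(A, C))

a, b, c = project(A), project(B), project(C)

still_on_a_line = collinear(a, b, c)   # straight lines are preserved
image_ab = math.dist(a, b)             # image length of the first half
image_bc = math.dist(b, c)             # image length of the second half

print(still_on_a_line, image_ab, image_bc)
```

The two halves of the segment have equal length in 3D, but in the image the half closer to the camera is three times longer than the far half, while the three image points still lie on one line.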

Projective geometry - what is lost?

Length is not preserved

Angles are not preserved

Straight lines are still straight

Vanishing points and lines

• Parallel lines in the world intersect in the image at a “vanishing point”

Constructing the vanishing point of a line

Vanishing points and lines

All lines that are parallel to each other in the world share the same vanishing point.
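
The shared vanishing point can be observed numerically. The sketch below assumes an ideal pinhole camera with focal length f and lines that are not parallel to the image plane (third direction component non-zero); the function names and sample lines are illustrative assumptions. Points far along a line X0 + t·d project close to (f·d1/d3, f·d2/d3), which depends only on the direction d and not on the starting point X0.

```python
def project(point, f=1.0):
    """Ideal pinhole projection of a 3D point (X1, X2, X3)."""
    X1, X2, X3 = point
    return (f * X1 / X3, f * X2 / X3)


def point_on_line(origin, direction, t):
    """Point origin + t * direction on a 3D line."""
    return tuple(o + t * d for o, d in zip(origin, direction))


def vanishing_point(direction, f=1.0):
    """Image of the point at infinity in the given direction (needs d3 != 0)."""
    d1, d2, d3 = direction
    return (f * d1 / d3, f * d2 / d3)


direction = (1.0, 0.0, 2.0)        # common direction of both lines
origin_a = (-1.0, -1.0, 4.0)       # two different starting points
origin_b = (2.0, 1.0, 3.0)

vp = vanishing_point(direction)
far_a = project(point_on_line(origin_a, direction, 1e9))
far_b = project(point_on_line(origin_b, direction, 1e9))

print(vp, far_a, far_b)
```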

Homogeneous coordinates

• The projection $x_1 = f X_1 / X_3$ is nonlinear!
• It can be made linear using homogeneous coordinates
• Homogeneous coordinates allow transforms to be concatenated easily

Homogeneous coordinates

Conversion to homogeneous coordinates: $(x, y) \mapsto (x, y, 1)$

Conversion from homogeneous coordinates: $(x, y, w) \mapsto (x/w, y/w)$

Invariance to scaling: $(x, y, w)$ and $(kx, ky, kw)$ represent the same point for any $k \neq 0$

E.g. [1,2,3] is the same as [3,6,9], and both represent the same inhomogeneous point [1/3, 2/3] ≈ [0.33, 0.67].
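
The two conversions and the scale invariance can be sketched in a few lines. The function names are illustrative choices, not taken from the slides.

```python
def to_homogeneous(point):
    """Append a 1 to an inhomogeneous point: (x, y) -> (x, y, 1)."""
    return tuple(point) + (1,)


def from_homogeneous(point):
    """Divide by the last coordinate: (x, y, w) -> (x/w, y/w)."""
    *coords, w = point
    if w == 0:
        raise ValueError("point at infinity has no inhomogeneous form")
    return tuple(c / w for c in coords)


p = from_homogeneous((1, 2, 3))
q = from_homogeneous((3, 6, 9))    # same point, scaled by 3

print(to_homogeneous((4, 5)), p, q)
```

Both homogeneous vectors map to the same inhomogeneous point (1/3, 2/3), which is the scale invariance stated above.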
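
Basic geometry in homogeneous coordinates

• Line equation: $ax + by + c = 0$, i.e. the line is $l = (a, b, c)^T$
• A pixel p in homogeneous coordinates: $p = (x, y, 1)^T$; p lies on l when $l^T p = 0$
• The line through two points is given by the cross product of the points: $l = p_1 \times p_2$
• The intersection of two lines is given by the cross product of the lines: $p = l_1 \times l_2$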
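
Both cross-product rules can be tried on small integer examples. This is a minimal sketch; the sample points are made up for illustration.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Line through (0, 0) and (2, 2), i.e. y = x.
l1 = cross((0, 0, 1), (2, 2, 1))
# Line through (0, 2) and (2, 0), i.e. x + y = 2.
l2 = cross((0, 2, 1), (2, 0, 1))
# Their intersection as a homogeneous point; dividing by the last
# coordinate gives the inhomogeneous point (1, 1).
meet = cross(l1, l2)

# Line through (0, 1) and (1, 2), i.e. y = x + 1, parallel to l1.
l3 = cross((0, 1, 1), (1, 2, 1))
# Parallel lines meet at a point at infinity: the last coordinate is 0.
meet_at_infinity = cross(l1, l3)

print(l1, l2, meet, meet_at_infinity)
```

The last result has a zero third coordinate, so it has no inhomogeneous counterpart; its first two coordinates give the common direction of the two parallel lines.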

3D Euclidean transformation

• Cat moves through 3D space
• The movement of the nose can be described using a Euclidean transform: $X' = R X + t$, where R is a 3x3 rotation matrix and t is a translation vector

Building the 3D rotation matrix R

• R can be built from various representations (Euler angles, quaternion, angle-axis representation; the latter two are recommended)
• Euler angles represent the rotation using three parameters, one for each axis
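
A rotation matrix can be assembled from three per-axis rotations. The sketch below assumes the composition order R = Rz(gamma) · Ry(beta) · Rx(alpha); this ordering is an assumption made for illustration, since Euler-angle conventions differ and the slide's own formula is not recoverable from this text.

```python
import math


def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(g):
    c, s = math.cos(g), math.sin(g)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def euler_to_rotation(alpha, beta, gamma):
    """Assumed convention: rotate about x, then y, then z (fixed axes)."""
    return matmul(rot_z(gamma), matmul(rot_y(beta), rot_x(alpha)))


R = euler_to_rotation(0.1, -0.4, 1.2)
# A rotation matrix is orthonormal, so R^T R should be the identity.
RtR = matmul(transpose(R), R)
# The order of the elementary rotations matters.
xy = matmul(rot_x(0.5), rot_y(0.5))
yx = matmul(rot_y(0.5), rot_x(0.5))
order_matters = any(abs(xy[i][j] - yx[i][j]) > 1e-6
                    for i in range(3) for j in range(3))
```

The dependence on composition order is one reason the slides recommend quaternions or the angle-axis representation over Euler angles.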
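
3D Euclidean transformation

• Concatenation of successive transforms is a mess: $X'' = R_2 (R_1 X + t_1) + t_2$

Homogeneous coordinates save the day!

• Replace 3D points with their homogeneous versions: $X \mapsto (X, 1)^T$
• The Euclidean transform becomes a single 4x4 matrix: $T = \begin{pmatrix} R & t \\ 0^T & 1 \end{pmatrix}$
• Transformations can now be concatenated by matrix multiplication: $T = T_2 T_1$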
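
Concatenation by matrix multiplication can be verified on a small example. This is a minimal sketch with two made-up transforms (90-degree rotations with integer entries, so the arithmetic is exact); the helper names are illustrative.

```python
def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def make_transform(R, t):
    """Build the 4x4 homogeneous matrix [[R, t], [0, 1]]."""
    return [list(R[0]) + [t[0]],
            list(R[1]) + [t[1]],
            list(R[2]) + [t[2]],
            [0.0, 0.0, 0.0, 1.0]]


def apply(T, X):
    """Apply a 4x4 transform to a 3D point via its homogeneous version."""
    Xh = [[X[0]], [X[1]], [X[2]], [1.0]]
    Y = matmul(T, Xh)
    return tuple(Y[i][0] for i in range(3))


# 90 degrees about the z-axis, then a shift along x.
T1 = make_transform([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
                    (1.0, 0.0, 0.0))
# 90 degrees about the x-axis, then a shift along y.
T2 = make_transform([[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]],
                    (0.0, 2.0, 0.0))

X = (1.0, 2.0, 3.0)
step_by_step = apply(T2, apply(T1, X))   # apply T1 first, then T2
combined = apply(matmul(T2, T1), X)      # one matrix for both steps

print(step_by_step, combined)
```

Note the order: the transform applied first appears rightmost in the product.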

More 3D-3D and 2D-2D transformations

Examples of 2D-2D transforms

Perspective transformation (3D-2D)

Perspective using homogeneous coordinates
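
In homogeneous coordinates the perspective projection becomes a single 3x4 matrix product followed by a division. The sketch below assumes the ideal setup used so far (camera at the world origin, aligned with the world axes, focal length f); the function names and the sample point are illustrative.

```python
def matvec(M, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def perspective_matrix(f):
    """3x4 matrix of an ideal pinhole camera with focal length f."""
    return [[f, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0]]


def project(X, f):
    """Project a 3D point: linear map in homogeneous form, then divide."""
    Xh = list(X) + [1.0]                      # (X1, X2, X3, 1)
    xh = matvec(perspective_matrix(f), Xh)    # (f*X1, f*X2, X3)
    return (xh[0] / xh[2], xh[1] / xh[2])     # (f*X1/X3, f*X2/X3)


x = project((2.0, 4.0, 8.0), f=2.0)
print(x)
```

The nonlinearity has not disappeared; it is deferred to the final division by the third homogeneous coordinate.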

Wait! Our setup has several assumptions

• Camera at world origin
• Camera aligned with world coordinates
• Ideal pinhole camera

Removing the initial assumptions

• It is useful to split the overall projection matrix into three parts:
  • A part that depends on the internals of the camera (intrinsic)
  • A vanilla projection matrix
  • A Euclidean transformation between the world and camera frames (extrinsic)
• Assume first that the world is aligned with camera coordinates -> the extrinsic camera matrix is the identity

More realistic setting - camera pose

• Assume the camera is translated and rotated with respect to the world

The camera pose

• The non-ideal camera pose can be taken into account by first rotating and translating points from the world frame to the camera frame

The intrinsic parameters

• Transformation from metric units to pixel units
• Describe the hardware properties of a real camera
  • The image plane might be skewed
  • The pixels might not be square
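
These effects are commonly collected into a 3x3 upper-triangular intrinsic matrix. The form below follows the notation of Hartley & Zisserman and is an assumed parameterization, since the slide's own formula is not recoverable from this text. Here $f$ is the focal length in metric units, $m_x$ and $m_y$ are the numbers of pixels per unit distance along each image axis (they differ when pixels are not square), $s$ is the skew, and $(x_0, y_0)$ is the principal point in pixels.

```latex
K = \begin{pmatrix}
      \alpha_x & s        & x_0 \\
      0        & \alpha_y & y_0 \\
      0        & 0        & 1
    \end{pmatrix},
\qquad \alpha_x = f\, m_x, \quad \alpha_y = f\, m_y
```

For a camera with square pixels and no skew, $\alpha_x = \alpha_y$ and $s = 0$.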

Summary of steps from scene to image

• Move the scene point $(X_w, 1)^T$ into the camera coordinate system by a 4x4 (extrinsic) Euclidean transformation
• Project into the ideal camera via the vanilla perspective transformation
• Map the ideal image into the real image using the intrinsic matrix

Camera projection matrix P
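
The three steps can be chained into one 3x4 matrix, P = K [I | 0] T, where T is the 4x4 extrinsic transform. This is a minimal sketch under that factorization; the intrinsic values (focal length of 800 pixels, principal point at (320, 240)) and the camera pose are made-up example numbers.

```python
def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def extrinsic(R, t):
    """4x4 world-to-camera Euclidean transform [[R, t], [0, 1]]."""
    return [list(R[0]) + [t[0]],
            list(R[1]) + [t[1]],
            list(R[2]) + [t[2]],
            [0.0, 0.0, 0.0, 1.0]]


def camera_matrix(K, R, t):
    """P = K [I | 0] T: intrinsic, vanilla projection, extrinsic."""
    vanilla = [[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]]
    return matmul(K, matmul(vanilla, extrinsic(R, t)))


def project(P, Xw):
    """Project a world point to pixel coordinates."""
    xh = matmul(P, [[Xw[0]], [Xw[1]], [Xw[2]], [1.0]])
    return (xh[0][0] / xh[2][0], xh[1][0] / xh[2][0])


# Example values (assumptions for illustration only).
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = (0.0, 0.0, 5.0)   # world origin lies 5 units in front of the camera

P = camera_matrix(K, R, t)
u, v = project(P, (1.0, -0.5, 0.0))
print(u, v)
```

With these numbers the world point (1, -0.5, 0) lands at pixel (480, 160), and the world origin projects to the principal point.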

Beyond pinholes: Radial distortion

• Common in wide-angle lenses
• Creates non-linear terms in the projection
• Usually handled by solving for the non-linear terms and then correcting the image

[Figure: original image and corrected image]
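
One common way to model the non-linear terms is a polynomial in the squared radius. The sketch below assumes the two-coefficient model x_d = x (1 + k1 r^2 + k2 r^4) on normalized image coordinates and inverts it by fixed-point iteration; the model choice, the coefficient values, and the function names are assumptions for illustration, not the method shown on the slide.

```python
def distort(point, k1, k2):
    """Apply the assumed polynomial radial distortion to an ideal point."""
    x, y = point
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return (x * scale, y * scale)


def undistort(point, k1, k2, iterations=20):
    """Invert distort() by fixed-point iteration (for mild distortion)."""
    xd, yd = point
    x, y = xd, yd                       # start from the distorted point
    for _ in range(iterations):
        r2 = x * x + y * y
        scale = 1.0 + k1 * r2 + k2 * r2 * r2
        x, y = xd / scale, yd / scale   # refine the ideal-point estimate
    return (x, y)


ideal = (0.3, -0.2)
k1, k2 = -0.2, 0.05                     # made-up coefficients
distorted = distort(ideal, k1, k2)
recovered = undistort(distorted, k1, k2)
print(distorted, recovered)
```

In practice the coefficients are estimated during camera calibration; correcting a whole image then amounts to resampling it through this mapping.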

Things to remember

• Pinhole camera model
• Homogeneous coordinates
• Camera projection matrix

The end
