Camera Calibrations - Computer Vision

Miniature faking

In close-up photos, the depth of field is limited.

http://en.wikipedia.org/wiki/File:Jodhpur_tilt_shift.jpg
Miniature faking
Miniature faking

http://en.wikipedia.org/wiki/File:Oregon_State_Beavers_Tilt-Shift_Miniature_Greg_Keene.jpg
Review
• Previous section:
– Model fitting and outlier rejection
Review: Hough transform

Each point (x, y) in image space votes for all line parameters (m, b) consistent with it; lines supported by many points appear as peaks in the (m, b) accumulator array. (The original slide showed the point space (x, y) alongside the accumulator bin counts in parameter space (m, b).)

Slide from S. Savarese
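The voting scheme can be sketched in slope–intercept space. This is a toy illustration, not the slide's implementation: the helper `hough_line_mb`, the bin ranges, and the sample points are all made up.

```python
import numpy as np

def hough_line_mb(points, m_bins, b_bins):
    """Hypothetical helper: each point (x, y) votes for every (m, b)
    pair satisfying b = y - m*x, quantized to the nearest b bin."""
    acc = np.zeros((len(m_bins), len(b_bins)), dtype=int)
    for x, y in points:
        for i, m in enumerate(m_bins):
            b = y - m * x
            j = np.argmin(np.abs(b_bins - b))  # nearest b bin
            acc[i, j] += 1
    return acc

# Points on the line y = 2x + 1 should concentrate votes at (m=2, b=1)
pts = [(0, 1), (1, 3), (2, 5), (3, 7)]
m_bins = np.linspace(-3, 3, 13)   # step 0.5, includes m = 2
b_bins = np.linspace(-3, 3, 13)   # step 0.5, includes b = 1
acc = hough_line_mb(pts, m_bins, b_bins)
i, j = np.unravel_index(acc.argmax(), acc.shape)
```

All four points vote in the (m = 2, b = 1) bin, so the accumulator peak recovers the line.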
Review: RANSAC

(Illustration: a candidate line model scoring N_I = 14 inliers.)

Algorithm:
1. Sample (randomly) the minimum number of points required to fit the model (here 2, for a line)
2. Solve for model parameters using the samples
3. Score by the fraction of inliers within a preset threshold of the model

Repeat 1-3 until the best model is found with high confidence
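The three steps above can be sketched for line fitting. This is a minimal illustration: the function `ransac_line`, the threshold, and the synthetic data are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def ransac_line(pts, n_iters=200, thresh=0.1):
    """RANSAC for y = m*x + b: sample 2 points, fit, count inliers."""
    best_inliers, best_model = 0, None
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)  # step 1: sample
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:
            continue  # degenerate sample for this parameterization
        m = (y2 - y1) / (x2 - x1)                 # step 2: solve for model
        b = y1 - m * x1
        resid = np.abs(pts[:, 1] - (m * pts[:, 0] + b))  # vertical distances
        inliers = int((resid < thresh).sum())     # step 3: score by inliers
        if inliers > best_inliers:
            best_inliers, best_model = inliers, (m, b)
    return best_model, best_inliers

# 20 exact inliers on y = 3x - 1 plus 5 gross outliers
xs = np.linspace(0, 1, 20)
inlier_pts = np.column_stack([xs, 3 * xs - 1])
outliers = np.array([[0.1, 5.0], [0.5, -4.0], [0.9, 6.0], [0.3, 7.0], [0.7, -5.0]])
pts = np.vstack([inlier_pts, outliers])
(m, b), n_in = ransac_line(pts)
```

Any iteration that happens to sample two inliers recovers the true line exactly, so the outliers never influence the final fit.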
Review: 2D image transformations

Szeliski 2.1
This section – multiple views
• Today – Camera Calibration. Intro to multiple views
and Stereo.
• Next Lecture – Epipolar Geometry and Fundamental
Matrix. Stereo Matching (if there is time).
• Both lectures are relevant for project 2.
Recap: Oriented and Translated Camera

(Figure: camera related to the world frame (i_w, j_w, k_w, origin O_w) by rotation R and translation t.)
Recap: Degrees of freedom

x = K [R t] X

K contributes 5 degrees of freedom and [R t] contributes 6:

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}\alpha & s & u_0\\ 0 & \beta & v_0\\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix}r_{11} & r_{12} & r_{13} & t_x\\ r_{21} & r_{22} & r_{23} & t_y\\ r_{31} & r_{32} & r_{33} & t_z\end{bmatrix} \begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$
This Lecture: How to calibrate the camera?

x = K [R t] X

$$ \begin{bmatrix}su\\sv\\s\end{bmatrix} = \begin{bmatrix}* & * & * & *\\ * & * & * & *\\ * & * & * & *\end{bmatrix} \begin{bmatrix}X\\Y\\Z\\1\end{bmatrix} $$
What can we do with camera calibration?
How do we calibrate a camera?

Known 2D image coordinates (u, v) and known 3D locations (X, Y, Z):

u v X Y Z
880 214 312.747 309.140 30.086
43 203 305.796 311.649 30.356
270 197 307.694 312.358 30.418
886 347 310.149 307.186 29.298
745 302 311.937 310.105 29.216
943 128 311.202 307.572 30.682
476 590 307.106 306.876 28.660
419 214 309.317 312.490 30.230
317 335 307.435 310.151 29.318
783 521 308.253 306.300 28.881
235 427 306.650 309.301 28.905
665 429 308.069 306.831 29.189
655 362 309.671 308.834 29.029
427 333 308.255 309.955 29.267
412 415 307.546 308.613 28.963
746 351 311.036 309.206 28.913
434 415 307.518 308.175 29.069
525 234 309.950 311.262 29.990
716 308 312.160 310.772 29.080
602 187 311.988 312.709 30.514
World vs Camera coordinates

(Same table of 2D image coordinates and 3D world coordinates as above.)

Slide Credit: Savarese

Projection matrix

(Figure: world frame with origin O_w and axes i_w, j_w, k_w, related to the camera by rotation R and translation t.)

x = K [R t] X

x: Image coordinates: (u, v, 1)
K: Intrinsic matrix (3x3)
R: Rotation (3x3)
t: Translation (3x1)
X: World coordinates: (X, Y, Z, 1)
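Numerically, projecting a world point with M = K [R | t] looks like this. All parameter values here are made up for illustration.

```python
import numpy as np

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240), no skew
K = np.array([[500.0, 0, 320],
              [0, 500, 240],
              [0, 0, 1]])
R = np.eye(3)                         # no rotation, for simplicity
t = np.array([[0.0], [0.0], [10.0]])  # world origin sits 10 units in front
X = np.array([1.0, 2.0, 0.0, 1.0])    # world point in homogeneous coordinates

M = K @ np.hstack([R, t])             # 3x4 projection matrix K [R | t]
x = M @ X                             # homogeneous image point (su, sv, s)
u, v = x[0] / x[2], x[1] / x[2]       # divide out s to get pixel coordinates
```

Here the point is 10 units deep and 1 unit right of the optical axis, so it lands at u = 500·(1/10) + 320 = 370, and likewise v = 340.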
Projection matrix

Intrinsic Assumptions:
• Unit aspect ratio
• Optical center at (0,0)
• No skew

Extrinsic Assumptions:
• No rotation
• Camera at (0,0,0)

x = K [I 0] X:

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}f & 0 & 0 & 0\\ 0 & f & 0 & 0\\ 0 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$

Slide Credit: Savarese
Remove assumption: known optical center

Intrinsic Assumptions:
• Unit aspect ratio
• No skew

Extrinsic Assumptions:
• No rotation
• Camera at (0,0,0)

x = K [I 0] X:

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}f & 0 & u_0 & 0\\ 0 & f & v_0 & 0\\ 0 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$
Remove assumption: square pixels

Intrinsic Assumptions:
• No skew

Extrinsic Assumptions:
• No rotation
• Camera at (0,0,0)

x = K [I 0] X:

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}\alpha & 0 & u_0 & 0\\ 0 & \beta & v_0 & 0\\ 0 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$
Remove assumption: non-skewed pixels

Intrinsic Assumptions: (none remaining)

Extrinsic Assumptions:
• No rotation
• Camera at (0,0,0)

x = K [I 0] X:

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}\alpha & s & u_0 & 0\\ 0 & \beta & v_0 & 0\\ 0 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$

Note: different books use different notation for parameters


Oriented and Translated Camera

(Figure: camera related to the world frame (i_w, j_w, k_w, origin O_w) by rotation R and translation t.)
Allow camera translation

Intrinsic Assumptions: (none)

Extrinsic Assumptions:
• No rotation

x = K [I t] X:

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}\alpha & s & u_0\\ 0 & \beta & v_0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}1 & 0 & 0 & t_x\\ 0 & 1 & 0 & t_y\\ 0 & 0 & 1 & t_z\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$

Slide Credit: Savarese

3D Rotation of Points

Rotation around the coordinate axes, counter-clockwise (figure: a point p rotated to p'):

$$ R_x(\theta) = \begin{bmatrix}1 & 0 & 0\\ 0 & \cos\theta & -\sin\theta\\ 0 & \sin\theta & \cos\theta\end{bmatrix} $$

$$ R_y(\theta) = \begin{bmatrix}\cos\theta & 0 & \sin\theta\\ 0 & 1 & 0\\ -\sin\theta & 0 & \cos\theta\end{bmatrix} $$

$$ R_z(\theta) = \begin{bmatrix}\cos\theta & -\sin\theta & 0\\ \sin\theta & \cos\theta & 0\\ 0 & 0 & 1\end{bmatrix} $$
Allow camera rotation

x = K [R t] X:

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}\alpha & s & u_0\\ 0 & \beta & v_0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}r_{11} & r_{12} & r_{13} & t_x\\ r_{21} & r_{22} & r_{23} & t_y\\ r_{31} & r_{32} & r_{33} & t_z\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$
Degrees of freedom

x = K [R t] X

K contributes 5 degrees of freedom and [R t] contributes 6:

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}\alpha & s & u_0\\ 0 & \beta & v_0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}r_{11} & r_{12} & r_{13} & t_x\\ r_{21} & r_{22} & r_{23} & t_y\\ r_{31} & r_{32} & r_{33} & t_z\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$
Beyond Pinholes: Radial Distortion
• Common in wide-angle lenses or for special applications (e.g., security)
• Creates non-linear terms in the projection
• Usually handled by solving for distortion terms and then correcting the image to look like a pinhole camera image

(Figure: corrected image vs. barrel distortion. Image from Martin Habbecke)


How to calibrate the camera?

x = K [R t] X

$$ \begin{bmatrix}su\\sv\\s\end{bmatrix} = \begin{bmatrix}* & * & * & *\\ * & * & * & *\\ * & * & * & *\end{bmatrix}\begin{bmatrix}X\\Y\\Z\\1\end{bmatrix} $$
Calibrating the Camera
Use a scene with known geometry:
– Correspond image points to 3D points
– Get least squares solution (or non-linear solution)

Known 2D image coords, known 3D locations, unknown camera parameters m_ij:

$$ \begin{bmatrix}su\\sv\\s\end{bmatrix} = \begin{bmatrix}m_{11} & m_{12} & m_{13} & m_{14}\\ m_{21} & m_{22} & m_{23} & m_{24}\\ m_{31} & m_{32} & m_{33} & m_{34}\end{bmatrix}\begin{bmatrix}X\\Y\\Z\\1\end{bmatrix} $$

Unknown Camera Parameters


How do we calibrate a camera?
Known 2D image coords and known 3D locations:

(Same correspondence table as above.)
Estimate of camera center

1.0486 -0.3645 1.5706 -0.1490 0.2598


-1.6851 -0.4004 -1.5282 0.9695 0.3802
-0.9437 -0.4200 -0.6821 1.2856 0.4078
1.0682 0.0699 0.4124 -1.0201 -0.0915
0.6077 -0.0771 1.2095 0.2812 -0.1280
1.2543 -0.6454 0.8819 -0.8481 0.5255
-0.2709 0.8635 -0.9442 -1.1583 -0.3759
-0.4571 -0.3645 0.0415 1.3445 0.3240
-0.7902 0.0307 -0.7975 0.3017 -0.0826
0.7318 0.6382 -0.4329 -1.4151 -0.2774
-1.0580 0.3312 -1.1475 -0.0772 -0.2667
0.3464 0.3377 -0.5149 -1.1784 -0.1401
0.3137 0.1189 0.1993 -0.2854 -0.2114
-0.4310 0.0242 -0.4320 0.2143 -0.1053
-0.4799 0.2920 -0.7481 -0.3840 -0.2408
0.6109 0.0830 0.8078 -0.1196 -0.2631
-0.4081 0.2920 -0.7605 -0.5792 -0.1936
-0.1109 -0.2992 0.3237 0.7970 0.2170
0.5129 -0.0575 1.3089 0.5786 -0.1887
0.1406 -0.4527 1.2323 1.4421 0.4506
Unknown Camera Parameters

Known 2D image coords and known 3D locations:

$$ \begin{bmatrix}su\\sv\\s\end{bmatrix} = \begin{bmatrix}m_{11} & m_{12} & m_{13} & m_{14}\\ m_{21} & m_{22} & m_{23} & m_{24}\\ m_{31} & m_{32} & m_{33} & m_{34}\end{bmatrix}\begin{bmatrix}X\\Y\\Z\\1\end{bmatrix} $$

Expanding the product:

$$ su = m_{11}X + m_{12}Y + m_{13}Z + m_{14} $$
$$ sv = m_{21}X + m_{22}Y + m_{23}Z + m_{24} $$
$$ s = m_{31}X + m_{32}Y + m_{33}Z + m_{34} $$

Substituting the expression for s into the first two equations:

$$ (m_{31}X + m_{32}Y + m_{33}Z + m_{34})\,u = m_{11}X + m_{12}Y + m_{13}Z + m_{14} $$
$$ (m_{31}X + m_{32}Y + m_{33}Z + m_{34})\,v = m_{21}X + m_{22}Y + m_{23}Z + m_{24} $$

$$ m_{31}uX + m_{32}uY + m_{33}uZ + m_{34}u = m_{11}X + m_{12}Y + m_{13}Z + m_{14} $$
$$ m_{31}vX + m_{32}vY + m_{33}vZ + m_{34}v = m_{21}X + m_{22}Y + m_{23}Z + m_{24} $$
Unknown Camera Parameters

$$ m_{31}uX + m_{32}uY + m_{33}uZ + m_{34}u = m_{11}X + m_{12}Y + m_{13}Z + m_{14} $$
$$ m_{31}vX + m_{32}vY + m_{33}vZ + m_{34}v = m_{21}X + m_{22}Y + m_{23}Z + m_{24} $$

Moving everything to one side:

$$ 0 = m_{11}X + m_{12}Y + m_{13}Z + m_{14} - m_{31}uX - m_{32}uY - m_{33}uZ - m_{34}u $$
$$ 0 = m_{21}X + m_{22}Y + m_{23}Z + m_{24} - m_{31}vX - m_{32}vY - m_{33}vZ - m_{34}v $$
Unknown Camera Parameters

$$ 0 = m_{11}X + m_{12}Y + m_{13}Z + m_{14} - m_{31}uX - m_{32}uY - m_{33}uZ - m_{34}u $$
$$ 0 = m_{21}X + m_{22}Y + m_{23}Z + m_{24} - m_{31}vX - m_{32}vY - m_{33}vZ - m_{34}v $$

• Method 1 – homogeneous linear system. Solve for M's entries using linear least squares:

$$ \begin{bmatrix}
X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 & -u_1X_1 & -u_1Y_1 & -u_1Z_1 & -u_1 \\
0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 & -v_1X_1 & -v_1Y_1 & -v_1Z_1 & -v_1 \\
 & & & & & \vdots & & & & & & \\
X_n & Y_n & Z_n & 1 & 0 & 0 & 0 & 0 & -u_nX_n & -u_nY_n & -u_nZ_n & -u_n \\
0 & 0 & 0 & 0 & X_n & Y_n & Z_n & 1 & -v_nX_n & -v_nY_n & -v_nZ_n & -v_n
\end{bmatrix}
\begin{bmatrix} m_{11}\\ m_{12}\\ m_{13}\\ m_{14}\\ m_{21}\\ m_{22}\\ m_{23}\\ m_{24}\\ m_{31}\\ m_{32}\\ m_{33}\\ m_{34} \end{bmatrix}
= \begin{bmatrix}0\\0\\ \vdots \\ 0\\0\end{bmatrix} $$

In Python, see numpy.linalg.svd.
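A sketch of Method 1 with numpy.linalg.svd. The right singular vector belonging to the smallest singular value is the unit-norm solution of A m = 0, so M is recovered up to scale; the ground-truth matrix and the 3D points below are synthetic test values, not data from the slides.

```python
import numpy as np

def calibrate_dlt(pts2d, pts3d):
    """Method 1: stack two rows per 2D-3D correspondence into A, then take
    the right singular vector of the smallest singular value of A."""
    rows = []
    for (u, v), (X, Y, Z) in zip(pts2d, pts3d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return Vt[-1].reshape(3, 4)   # unit-norm null vector of A, as a 3x4 M

# Round trip against a made-up ground-truth projection matrix
M_true = np.array([[400.0, 0, 320, 50],
                   [0, 400, 240, 30],
                   [0, 0, 1, 10]])
pts3d = np.random.default_rng(1).uniform(-1, 1, (8, 3))
proj = (M_true @ np.column_stack([pts3d, np.ones(8)]).T).T
pts2d = proj[:, :2] / proj[:, 2:3]
M = calibrate_dlt(pts2d, pts3d)
M = M / M[2, 3] * M_true[2, 3]    # undo the arbitrary scale for comparison
```

With noise-free correspondences, the recovered M matches the ground truth exactly (up to that overall scale).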
Unknown Camera Parameters

• Method 2 – nonhomogeneous linear system (fix m_34 = 1). Solve for M's remaining entries using linear least squares, in Ax = b form:

$$ \begin{bmatrix}
X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 & -u_1X_1 & -u_1Y_1 & -u_1Z_1 \\
0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 & -v_1X_1 & -v_1Y_1 & -v_1Z_1 \\
 & & & & & \vdots & & & & & \\
X_n & Y_n & Z_n & 1 & 0 & 0 & 0 & 0 & -u_nX_n & -u_nY_n & -u_nZ_n \\
0 & 0 & 0 & 0 & X_n & Y_n & Z_n & 1 & -v_nX_n & -v_nY_n & -v_nZ_n
\end{bmatrix}
\begin{bmatrix} m_{11}\\ m_{12}\\ \vdots \\ m_{32}\\ m_{33} \end{bmatrix}
= \begin{bmatrix} u_1\\ v_1\\ \vdots\\ u_n\\ v_n \end{bmatrix} $$

In Python, see numpy.linalg.lstsq.
Calibration with linear method
• Advantages
– Easy to formulate and solve
– Provides initialization for non-linear methods

• Disadvantages
– Doesn’t directly give you human-interpretable camera parameters
– Doesn’t model radial distortion
– Can’t impose constraints, such as known focal length

• Non-linear methods are preferred


– Define error as difference between projected points and measured points
– Minimize error using Newton’s method or other non-linear optimization
Can we factorize M back to K [R | t]?
• Yes!
• You can use RQ factorization (note – not the more familiar QR factorization). R (the upper-triangular factor) is K, and Q (the orthogonal factor) is the rotation R. The translation t, the last column of [R | t], is inv(K) * last column of M.
– But you need to do a bit of post-processing to make sure that the matrices are valid. See http://ksimek.github.io/2012/08/14/decompose/
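A sketch of that decomposition, assuming SciPy's scipy.linalg.rq is available. The sign fix on K's diagonal is the kind of post-processing mentioned above; K_true, R_true, and t_true are made-up test values for a round-trip check.

```python
import numpy as np
from scipy.linalg import rq   # assumes SciPy is available

def decompose_projection(M):
    """Factor M = K [R | t]: RQ-factor the left 3x3 block, then flip signs
    so K has a positive diagonal (post-processing to make K, R valid)."""
    K, R = rq(M[:, :3])
    D = np.diag(np.sign(np.diag(K)))  # sign-flip matrix; D is its own inverse
    K, R = K @ D, D @ R
    t = np.linalg.inv(K) @ M[:, 3]    # t = inv(K) * last column of M
    return K / K[2, 2], R, t          # normalize so K[2,2] = 1

# Round-trip check with made-up intrinsics and pose
K_true = np.array([[500.0, 2, 320], [0, 480, 240], [0, 0, 1]])
th = 0.3
R_true = np.array([[np.cos(th), -np.sin(th), 0],
                   [np.sin(th),  np.cos(th), 0],
                   [0, 0, 1]])
t_true = np.array([1.0, -2.0, 3.0])
K, R, t = decompose_projection(K_true @ np.column_stack([R_true, t_true]))
```

Because an upper-triangular factor with positive diagonal times an orthogonal factor is unique for an invertible matrix, the round trip recovers K, R, and t exactly here.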
For project 3, we want the camera center
Estimate of camera center

(Same table of estimates as above.)
Oriented and Translated Camera

(Figure: camera related to the world frame (i_w, j_w, k_w, origin O_w) by rotation R and translation t.)
Recovering the camera center

x = K [R t] X

$$ w\begin{bmatrix}u\\v\\1\end{bmatrix} = \begin{bmatrix}\alpha & s & u_0\\ 0 & \beta & v_0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}r_{11} & r_{12} & r_{13} & t_x\\ r_{21} & r_{22} & r_{23} & t_y\\ r_{31} & r_{32} & r_{33} & t_z\end{bmatrix}\begin{bmatrix}x\\y\\z\\1\end{bmatrix} $$

The translation t is not the camera center -C; it is -RC (because a point is rotated before t_x, t_y, and t_z are added).

The last column of M, m_4, is K t, so K^-1 m_4 is t. We therefore need -R^-1 K^-1 m_4 to get C.

Writing M = [Q | m_4], Q is K R, so we just need C = -Q^-1 m_4.
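The final expression is one line of NumPy; the K, R, and C values below are made up for a round-trip check.

```python
import numpy as np

def camera_center(M):
    """C = -inv(Q) @ m4, where M = [Q | m4] and Q = K R."""
    Q, m4 = M[:, :3], M[:, 3]
    return -np.linalg.inv(Q) @ m4

# Round trip: put the camera at a known center C, so that t = -R C
K = np.array([[500.0, 0, 320], [0, 500, 240], [0, 0, 1]])
R = np.eye(3)                      # identity rotation keeps the check simple
C_true = np.array([2.0, -1.0, 4.0])
M = K @ np.column_stack([R, -R @ C_true])
C = camera_center(M)
```

Note this works directly on the calibrated M without first splitting it into K and R, since the K and R factors cancel inside Q^-1.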
Estimate of camera center

(Same table of estimates as above.)
Stereo:
Intro

Computer Vision
James Hays

Slides by
Kristen Grauman
Multiple views

• stereo vision (image: Lowe)
• structure from motion (image: Hartley and Zisserman)
• optical flow
Why multiple views?
• Structure and depth are inherently ambiguous from
single views.

Images from Lana Lazebnik


Why multiple views?
• Structure and depth are inherently ambiguous from
single views.

(Figure: two scene points P1 and P2 that project to the same image point P1' = P2' through the optical center.)
• What cues help us to perceive 3d shape and depth?
Shading

[Figure from Prados & Faugeras 2006]


Focus/defocus

Images from the same point of view, taken with different camera parameters, yield 3D shape / depth estimates.

[figs from H. Jin and P. Favaro, 2002]


Texture

[From A.M. Loh. The recovery of 3-D structure using visual texture patterns. PhD thesis]
Perspective effects

Image credit: S. Seitz


Motion

Figures from L. Zhang http://www.brainconnection.com/teasers/?main=illusion/motion-shape


Occlusion

René Magritte's famous painting Le Blanc-Seing (literal translation: "The Blank Signature"; the title roughly means "free hand"), 1965.

If stereo were critical for depth perception, navigation, recognition, etc., then this would be a problem.
Stereo photography and stereo viewers
Take two pictures of the same subject from two slightly
different viewpoints and display so that each eye sees
only one of the images.

Image from fisher-price.com


Invented by Sir Charles Wheatstone, 1838
http://www.johnsonshawmuseum.org
http://www.johnsonshawmuseum.org
Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923
http://www.well.com/~jimg/stereo/stereo_list.html
http://www.well.com/~jimg/stereo/stereo_list.html
Autostereograms

Exploit disparity as
depth cue using single
image.
(Single image random
dot stereogram, Single
image stereogram)

Images from magiceye.com


Autostereograms

Images from magiceye.com


Parallax and our universe
Look again at that dot. That's here. That's
home. That's us. On it everyone you love,
everyone you know, everyone you ever heard
of, every human being who ever was, lived
out their lives. The aggregate of our joy and
suffering, thousands of confident religions,
ideologies, and economic doctrines, every
hunter and forager, every hero and coward,
every creator and destroyer of civilization,
every king and peasant, every young couple
in love, every mother and father, hopeful
child, inventor and explorer, every teacher of
morals, every corrupt politician, every
"superstar," every "supreme leader," every
saint and sinner in the history of our species
lived there--on a mote of dust suspended in a
sunbeam.

— Carl Sagan

https://en.wikipedia.org/wiki/Pale_Blue_Dot
Nicolaus Copernicus. Motion of the Sun (yellow), Earth (blue), and Mars (red). At left, Copernicus' heliocentric motion; at right, traditional geocentric motion (often exemplified specifically by the Ptolemaic system), including the retrograde motion of Mars.

https://en.wikipedia.org/wiki/Heliocentrism
If the apparent motion of the planets is caused by parallax, why aren’t
we seeing parallax for stars?

It was one of Tycho Brahe's principal objections to Copernican


heliocentrism that for it to be compatible with the lack of observable
stellar parallax, there would have to be an enormous and unlikely
void between the orbit of Saturn and the eighth sphere (the fixed
stars).

The angles involved in these calculations are very small and thus difficult to measure. The nearest star to the Sun (and also the star with the largest parallax), Proxima Centauri, has a parallax of 0.7685 ± 0.0002 arcsec.[1] This angle is approximately that subtended by an object 2 centimeters in diameter located 5.3 kilometers away. The first reliable measurements of parallax were not made until 1838, by Friedrich Bessel.

(Portrait: Tycho Brahe)
https://en.wikipedia.org/wiki/Stellar_parallax
https://en.wikipedia.org/wiki/Stellar_parallax
Stereo vision

Two ways to get multiple views: two cameras with simultaneous views, or a single moving camera and a static scene.
Modern stereo depth estimation example
Multi-view geometry problems
• Stereo correspondence: Given a point in one of the
images, where could its corresponding points be in the
other images?

(Figure: three cameras with poses (R1,t1), (R2,t2), (R3,t3) viewing the same scene.)
Slide credit: Noah Snavely
Multi-view geometry problems
• Structure: Given projections of the same 3D point in two
or more images, compute the 3D coordinates of that point

(Figure: three cameras with poses (R1,t1), (R2,t2), (R3,t3) viewing the same scene.)
Slide credit: Noah Snavely
Multi-view geometry problems
• Motion: Given a set of corresponding points in two or
more images, compute the camera parameters

(Figure: three cameras with unknown poses (R1,t1), (R2,t2), (R3,t3) to be estimated.)
Slide credit: Noah Snavely
Estimating depth with stereo
• Stereo: shape from “motion” between two views
• We’ll need to consider:
• Info on camera pose (“calibration”)
• Image point correspondences

(Figure: a scene point projected through the optical centers onto two image planes.)
Camera parameters

Extrinsic parameters: Camera frame 1 → Camera frame 2
Intrinsic parameters: image coordinates relative to the camera frame → pixel coordinates

• Extrinsic params: rotation matrix and translation vector
• Intrinsic params: focal length, pixel sizes (mm), image center point, radial distortion parameters

We'll assume for now that these parameters are given and fixed.
Geometry for a simple stereo system
• First, assuming parallel optical axes, known camera parameters (i.e., calibrated cameras):

(Figure: world point P at depth Z; image points p_l and p_r in the left and right image planes; focal length f; optical centers O_l and O_r separated by the baseline T.)
Geometry for a simple stereo system
• Assume parallel optical axes, known camera parameters (i.e., calibrated cameras). What is the expression for Z?

Similar triangles (p_l, P, p_r) and (O_l, P, O_r):

$$ \frac{T - (x_l - x_r)}{Z - f} = \frac{T}{Z} \quad\Rightarrow\quad Z = f\,\frac{T}{x_l - x_r} $$

The quantity x_l - x_r is the disparity.
Depth from disparity

image I(x,y), disparity map D(x,y), image I′(x′,y′), with (x′,y′) = (x + D(x,y), y)

So if we could find the corresponding points in two images, we could estimate relative depth…
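Under the simple stereo geometry above, turning a disparity map into depth is a single vectorized division, Z = f·T / disparity. The focal length and baseline values below are invented for illustration.

```python
import numpy as np

def depth_from_disparity(D, f, T):
    """Z = f * T / disparity, guarding against zero disparity.
    D: disparity map in pixels; f: focal length in pixels;
    T: baseline, in the same units as the returned depth."""
    D = np.asarray(D, dtype=float)
    Z = np.full_like(D, np.inf)           # zero disparity -> infinitely far
    np.divide(f * T, D, out=Z, where=D > 0)
    return Z

# f = 700 px, baseline T = 0.12 m: a 21-pixel disparity is 4 m away
Z = depth_from_disparity([[21.0, 42.0], [0.0, 7.0]], f=700, T=0.12)
```

Halving the depth doubles the disparity, which is why stereo depth estimates are most precise for nearby surfaces.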
Where do we need to search?
