KEMBAR78
Slide - 3DP - 10 - Structure From Motion | PDF | Computer Vision | Imaging
0% found this document useful (0 votes)
44 views38 pages

Slide - 3DP - 10 - Structure From Motion

The document discusses Structure from Motion (SfM), a technique for reconstructing 3D structures from 2D images taken from various viewpoints. It outlines the SfM pipeline, including correspondence search, geometric verification, initialization, pose recovery, triangulation, and bundle adjustment. Additionally, it mentions open-source frameworks like COLMAP and Bundler for implementing SfM.

Uploaded by

amir
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
44 views38 pages

Slide - 3DP - 10 - Structure From Motion

The document discusses Structure from Motion (SfM), a technique for reconstructing 3D structures from 2D images taken from various viewpoints. It outlines the SfM pipeline, including correspondence search, geometric verification, initialization, pose recovery, triangulation, and bundle adjustment. Additionally, it mentions open-source frameworks like COLMAP and Bundler for implementing SfM.

Uploaded by

amir
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 38

3D Data Processing

Structure from Motion


Alberto Pretto
SfM: Problem Statement
SfM is the process of reconstructing the 3D
structure of a scene from its projections into a
series of images taken from different viewpoints.

2
Picture from Sameer Agarwal et al.
Structure from Motion
● Input: a set of images that frame the same scene
from different points of view. Each image should
have some field of view overlap with other
images
– Images can be totally unordered and taken with
differen cameras
– Camera calibration parameters may not be available
● Output: camera positions, camera calibrations
(optional) and sparse 3D scene structure (i.e., a
set of 3D points)
3
Incremental SfM Typical Pipeline

● Incremental SfM is a sequential processing pipeline


with an iterative reconstruction component
● In this discussion, we report some implementation
details from:
J. L. Schanberger and J. Frahm, "Structure-from-Motion Revisited,"
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016,
4
Correspondence Search
● For each image extract inavarinat
features with associated descriptors
(e.g., SIFT, SURF, ORB,...).
● Match features between every
image pair
– This is just a naive approach since an
exhaustive matching has
computational complexity:

– Many efficient approaches have


been introduced to improve matching
5
Geometric Verification
● Matching is based solely
on appearance, no
guarantee that matches
refer to the same 3D point
in the scene
● Use RANSAC +
projective/epipolar
geometry to verify the
matches
6
Geometric Verification
● Try all models:
– Essential matrix E (use an initial guess calibration)
– Fundamental matrix F (uncalibrated settings)
– Homography matrix H (planar scene or pure rotation)
● For each model, use RANSAC to estimate the best
estimate:
– Sample N minimal set of mathes (e.g., 5, 8, or 4) and find the
maximum number of inliers that support the model
● Select the best model E, F or H based on certain criteria:
its inliers define the inlier matches between an image
pair
7
Correspondences

3 3

5 6
5 8
Observations

3 3

4 5
5 9
Scene Graph
After geometric verification, we can build a
scene graph with images as nodes and
verified pairs of images as edges. It
provides:
– Correspondences between a pair of images.
– Number of correspondences per image.
– Number of observations in an image.
– The correspondence of an image observation
with all other images
10
Incremental Mapping

● Iterative, "try and error" process


● Reconstruction may never recover from a
bad initialization and also from a bad
image registration
11
Initialization
● Find a good initial pair ("seed"):
– Sort images such that images with
more correspondences are
preferred, i.e. they appear in the
front of the list.
– Collect images that are
connected to the first seed image
and have not been registered
before
– If not done before, perform
geometrical verification
12
Initialization
● Try to recover a rigid body
transformation from the selected
model (E, F, or H)
– In the uncalibrated case, this most likely
leads to a ill-defined reconstruction
● If failed, restart the initalization process
● If the motion is a pure rotation
estimated from H, restart the
initalization process
– No triangulation can be obtained from a R,t
pure rotation
13
Pose Recovery
When recovering
position from the
Essential or
Fundamental
matrix we get 4
possible solutions:
use cheirality
constraint to select
one

[Picture from Marc Pollefeys, Viktor Larsson] 14


Initial Triangulation
Using point correspondences
and the initial R,t, triangulate
point to obtain their 3D
coordinates

R,t 15
Triangulation
For a non rectilinear stereo rigs, no single
solution due to noise

X?

x’
x

[Some slide from 16-385 Computer Vision, www.cs.cmu.edu ] 16


Triangulation
We have R,t and (an estimate of) the
calibrations parameters: given a 3D point,
we know how to project it in both views!

X?

x’
x

17
Triangulation
To use our points cooridnates x, we need to
normalize ("h-normalize") the homogeneous
points:
Same direction but differs
by a scale factor

Cross product of two vectors of same


direction is zero:

18
Triangulation

19
Triangulation

20
Triangulation

Third line is a linear combination of the first and second lines.

21
Triangulation
Use both points:

Is a homogeneous linear system!


Use SVD and take the last (h-normalized)
column of V
22
PnP-based Image Registration
A newly registered image
must observe existing scene
points.
● Use Perspective-n-Point
(PnP) using feature
correspondences to
triangulated points in
already registered images.
– Estimate both R’,t’ and
(optionally) intrinsic
parameters.
– Use a modified version of PnP R,t
that exploits RANSAC and a
minimal pose solver R’,t’
23
3D Points Registration
The newly
registered image
may also increase
scene coverage
by extending the
set of points X
through
triangulation.
R,t
R’,t’
24
Next Best View Selection
Several criteria to select a still not registered image, among others:
– "Sees" the maximum of already registered points, or;
– Use visivility score (e.g., sum #occupied cells * res over a number of
resolutions with res = 2,4,8,...):

25
[J. L. Schanberger and J. Frahm]
Bundle Adjustment
● We may iterate with PnP and triangulation
steps to register new images and new points
but...
● ... they are very correltated procedures:
uncertainties in the camera pose propagate
to triangulated points and vice-versa
● Bundle Adjustment (BA): joint non-linear
refinement of camera parameters and point
parameters
26
Bundle Adjustment

R1,t1

R2,t2
R3,t3 - = reprojection error
R4,t4
27
Bundle Adjustment
BA goal: find a suitable parameters set
(transformations Ti= Ri,ti, calibration
parameters θi , 3D points positions Pj , i=
0, ...Nc – 1, j=0, ..Np - 1) that minimizes the
sum of squares of reprojection errors (one
for each observation of each camera):

Solve with LM 28
BA Parametrization
● Camera positions (#=6*(Nc- 1))
– Set the coordinate system of one of the two seed cameras as reference
world frame
– Use a suitable rotation representation (e.g., with Axis-Angle, 6 DoF for each
camera)
● Calibration parameters (if not given) (#=3*Nc)
– Fix the principal point at the image center (principal point
calibration is an ill-posed problem)
– Optimize for each camera the focal length in pixel (e.g., one
shared focal lenght) and some distortion parameters (e.g., two
radial distortion parameters)
● Scene points (#=3*Np)
– Simply parametrized as 3D points
29
BA Complexity
LM update:

with

a = [a1, ..., am ]T problem parmeters vector, e.g. with


dimension:
m = 6*(Nc- 1) +3*Nc + 3*Np
Warning: matrix inversion requires O(m3)
30
Schur Complement Trick
Idea to efficiently solve BA: exploit J
sparsity and the fact the usually Np >> Nc
C1 C2 C3 X1 X2 X3 X4
Ci : camera i params
Camera 1 Xj : point j params
observations

Camera 2
observations

Camera 3
observations
31
Schur Complement Trick
Rewrite LM step as:
and let define H = JTJ + λI + λI

H=
C E
E P
T
C depends only on cameras,
P depends only on points
32
Schur Complement Trick

Partitioning the step update between


cameras and structure (points):

Camera Parameters
Structures (3D points)

33
Schur Complement Trick

Multiply both sides by

Schur complement of P 34
Schur Complement Trick

● P block diagonal matrix: calculating the


inverse of P by inverting each of its 3x3
blocks is cheap!
– Solve for δac then then backsubstituting it to
obtain the value of δap
● Complexity: O(Nc3)

35
Robust Cost Function
● Standard least square assumes gaussian errors
– High influence from high errors!
● To account for potential outliers, it is better to choose a
proper loss function ρ (e.g., Cauchy, Huber, Tuckey, ...)

36
Sparse to Dense Reconstruction

Use Multi-View Stereo (MVS): see next


lectures...
37
Picture from Sameer Agarwal et al.
Open Source SfM Frameworks
● COLMAP
https://colmap.github.io/
● Bundler: Structure from Motion (SfM) for
Unordered Image Collections
https://www.cs.cornell.edu/~snavely/bundler/
● Multicore Bundle Adjustment
http://grail.cs.washington.edu/projects/mcba/

38

You might also like