
Comprehensive Notes on Panorama Stitching
Introduction: Fundamentals of Panorama Stitching
Definition: Panorama stitching is the process of merging multiple
overlapping images into a single wide-angle composite image, creating a
seamless view larger than what any single image can capture.
Purpose: Overlap between images provides redundancy where features
appearing in multiple images can be aligned, allowing the software to infer
spatial relationships between the images.

1. Theoretical Foundation: Image Formation and Capture
1.1 Image Representation
 Definition: An image is a 2D array (matrix) of pixel values

 Properties:

o Color images contain three channels (RGB)

o Each channel typically uses 8 bits (0-255)

o Each pixel represents a vector [R,G,B]

o Notation: I_i(x,y) denotes intensity at coordinates (x,y) in image I_i

1.2 Camera Model


 Pinhole Camera Model: Maps 3D world points (X,Y,Z) to 2D image
coordinates (x,y)

 Mathematical Representation:

s[x;y;1] = K·[R|t]·[X;Y;Z;1]

Where:

o K: intrinsic matrix (focal length, optical center)

o R: rotation matrix

o t: translation vector
o s: scale factor (due to homogeneous coordinates)

Theoretical Significance: This model explains why images from different viewpoints can be geometrically related through transformations.
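
To make the projection concrete, here is a minimal NumPy sketch that applies the pinhole model to one 3D point; the intrinsics, pose, and point are illustrative values, not taken from these notes:

```python
import numpy as np

# Illustrative intrinsics and pose (hypothetical values)
K = np.array([[800.0,   0.0, 320.0],    # fx, 0, optical center cx
              [  0.0, 800.0, 240.0],    # 0, fy, optical center cy
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                           # no rotation
t = np.array([[0.0], [0.0], [0.0]])     # no translation

X_world = np.array([[1.0], [2.0], [10.0], [1.0]])   # homogeneous 3D point

# s·[x; y; 1] = K·[R|t]·[X; Y; Z; 1]
p = K @ np.hstack([R, t]) @ X_world
x, y = (p[:2] / p[2]).ravel()           # divide out the scale factor s
print(x, y)                             # resulting 2D pixel coordinates
```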

2. Acquisition Process: Image Set Capture


2.1 Capture Requirements
 Minimum Overlap: Approximately 30% between adjacent images

 Configuration Example: Three overlapping photos [I₁] —— [I₂] —— [I₃]

 Best Practices:

o Maintain consistent camera settings (exposure, white balance)

o Ideally rotate camera around optical center

o Minimize parallax effects by avoiding translation

2.2 Equipment Considerations


 Tripod recommended for stability

 Panoramic head to minimize parallax

 Consistent lighting conditions

3. Feature Detection
3.1 Concept and Purpose
 Objective: Detect distinct, repeatable visual patterns (corners, edges,
blobs) across images
 Mathematical Representation: For each image I_i, detect features
K_i = {k_1, k_2, …, k_m}

 Example: In image I₁, features might include k_1 = (x=105, y=212), k_2 = (x=400, y=118), etc.

3.2 Detection Algorithms


3.2.1 Harris Corner Detector
 Principle: Finds corners where intensity gradients change sharply in
both directions

 Mathematical Foundation: Based on eigenvalues of the structure tensor:

M = ∑ w(x,y) · [I_x²    I_xI_y]
               [I_xI_y  I_y²  ]

 Properties: Rotation invariant but not scale invariant
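
A short OpenCV sketch of Harris detection (the filename and thresholds are placeholder choices, not prescribed values):

```python
import cv2
import numpy as np

gray = np.float32(cv2.imread("I1.jpg", cv2.IMREAD_GRAYSCALE))   # placeholder path
# blockSize: neighborhood used for the structure tensor M; k: Harris sensitivity
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
corners = np.argwhere(response > 0.01 * response.max())         # (y, x) of strong corners
```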

3.2.2 SIFT (Scale-Invariant Feature Transform)


 Principle: Detects extrema in scale-space (Difference of Gaussian
pyramids)

 Properties:

o Scale and rotation invariant

o Partially invariant to illumination changes

o Resistant to viewpoint changes

 Process:

1. Scale-space extrema detection


2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor generation (128-dimensional vector)
3.2.3 ORB (Oriented FAST and Rotated BRIEF)
 Principle: Combines FAST keypoint detector with BRIEF descriptors

 Properties:

o Computationally efficient

o Binary descriptors (faster matching)

o Rotation invariant

3.3 Output Format


For each image I_i:
 Keypoint set: K_i = {k_1, k_2, …, k_m}

 Descriptor set: D_i = {d_1, d_2, …, d_m}, where d_j ∈ ℝ^128 for SIFT
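
As a sketch, SIFT detection with OpenCV (requires a build that includes SIFT, e.g. opencv-python ≥ 4.4; the filename is a placeholder):

```python
import cv2

img = cv2.imread("I1.jpg", cv2.IMREAD_GRAYSCALE)    # placeholder path
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# keypoints[j].pt is (x, y); descriptors has shape (m, 128), one row per d_j
```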

4. Feature Matching
4.1 Objective and Process
 Purpose: Pair keypoints from two images that represent the same
physical point in the scene

 Input:
o Descriptors D_i = {d_1, …, d_m} from I_i

o Descriptors D_j = {d’_1, …, d’_n} from I_j

4.2 Matching Algorithms


4.2.1 Nearest Neighbor Matching
 Principle: For each d ∈ D_i, find d’ ∈ D_j such that ||d - d’||₂ is
minimized

 Implementation: k-d trees or approximate nearest neighbor methods

4.2.2 Match Filtering Techniques


 Lowe’s Ratio Test:

Accept match if: ||d - d'₁||/||d - d'₂|| < 0.75

Where d’₁ and d’₂ are the closest and second-closest descriptors
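
A minimal OpenCV sketch of nearest-neighbor matching with Lowe's ratio test, assuming des1 and des2 are SIFT descriptor arrays from the detection step above:

```python
import cv2

bf = cv2.BFMatcher(cv2.NORM_L2)          # L2 distance suits SIFT descriptors
knn = bf.knnMatch(des1, des2, k=2)       # two nearest neighbors per query descriptor

# Keep a match only if the best neighbor clearly beats the second best
good = [m for m, n in knn if m.distance < 0.75 * n.distance]
```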

4.3 Example Output


Matched keypoints between I₁ and I₂:
 (k₁⁽¹⁾, k₄⁽²⁾)

 (k₂⁽¹⁾, k₅⁽²⁾)

 etc.

5. Homography Estimation
5.1 Homography Definition
 Concept: A projective transformation H ∈ ℝ³ˣ³ that maps points from
one plane to another

 Mathematical Expression:

[x']   [h₁₁ h₁₂ h₁₃] [x]
[y'] = [h₂₁ h₂₂ h₂₃]·[y]
[1 ]   [h₃₁ h₃₂ h₃₃] [1]

 Applicability:

o Valid when scene is planar

o Valid when camera rotates around a single center

o Approximation for small depth variations


5.2 Direct Linear Transform (DLT)
5.2.1 Problem Formulation
 Requirement: Minimum 4 point correspondences to solve for 8
degrees of freedom

 Cross-product Form: For each match (x,y)↔(x’,y’), we get two linear equations:

h₁₁x + h₁₂y + h₁₃ - x'(h₃₁x + h₃₂y + h₃₃) = 0


h₂₁x + h₂₂y + h₂₃ - y'(h₃₁x + h₃₂y + h₃₃) = 0

5.2.2 Matrix Construction


 For each correspondence point, generate two rows of matrix A

 Example for point #1: (x,y)=(105,212), (x’,y’)=(110,215):

[-105, -212, -1, 0, 0, 0, 105×110, 212×110, 110]


[0, 0, 0, -105, -212, -1, 105×215, 212×215, 215]

 Complete matrix A is 8×9 (4 points × 2 equations)

5.2.3 Solution via SVD


 Formulation: Ah = 0

 Solve via Singular Value Decomposition (SVD)

 Take h as the last column of V

 Normalize h so h₃₃ = 1
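
A minimal NumPy implementation of the DLT, using the four correspondences from the worked example in Section 12 (a sketch, without the coordinate normalization a production DLT would add):

```python
import numpy as np

def dlt_homography(src_pts, dst_pts):
    """Estimate H from four or more correspondences by solving Ah = 0 via SVD."""
    A = []
    for (x, y), (xp, yp) in zip(src_pts, dst_pts):
        # Two rows per correspondence, exactly as in Section 5.2.2
        A.append([-x, -y, -1, 0, 0, 0, x * xp, y * xp, xp])
        A.append([0, 0, 0, -x, -y, -1, x * yp, y * yp, yp])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=np.float64))
    H = Vt[-1].reshape(3, 3)    # last right singular vector spans the null space
    return H / H[2, 2]          # normalize so h₃₃ = 1

src = [(105, 212), (400, 118), (300, 300), (150, 400)]   # points in I₁
dst = [(110, 215), (405, 120), (303, 302), (152, 405)]   # points in I₂
H12 = dlt_homography(src, dst)
```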

5.3 RANSAC for Robust Estimation


 Purpose: Filter outliers from feature matches
 Algorithm:

1. Randomly sample 4 matches


2. Compute H using DLT
3. Count inliers (where ||p’ - Hp|| < ε)
4. Repeat N times, keep H with most inliers
5. Recompute H using all inliers
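
In practice this loop is usually delegated to OpenCV; a sketch, assuming good, kp1 (keypoints of I₁), and kp2 (keypoints of I₂) come from the matching sketch above:

```python
import cv2
import numpy as np

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])   # matched points in I₁
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])   # matched points in I₂

# RANSAC with reprojection threshold ε = 3 px; mask flags the surviving inliers
H12, mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, ransacReprojThreshold=3.0)
```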
5.4 Example Homography Result
H₁₂ = [1.056283  -0.014629   4.436752]
      [0.019237   1.053948  -6.068625]
      [0.000113   0.000040   1.000000]
6. Image Warping
6.1 Warping Concept
 Definition: Applying H to remap pixels from one image to another
coordinate system

 Process: For every pixel (x’,y’) in destination:

1. Compute (x,y) = H⁻¹(x’,y’)


2. Get pixel value from source image at (x,y)
3. Interpolate if (x,y) is non-integer
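
OpenCV's warpPerspective performs exactly this inverse-mapping-and-interpolate loop internally (each output pixel samples the source at H⁻¹·(x’,y’)); a sketch, assuming img2 and the H12 estimated above, with an illustrative canvas size:

```python
import cv2

canvas_size = (1200, 600)    # (width, height); illustrative choice
warped2 = cv2.warpPerspective(img2, H12, canvas_size,
                              flags=cv2.INTER_LINEAR)   # bilinear interpolation
```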
6.2 Coordinate Transformation
 Mathematical Expression: (x,y,1)ᵀ ~ H₁₂⁻¹·(x’,y’,1)ᵀ

 Example:

o Output pixel: (x’,y’) = (200, 150)

o Source coordinate: (x,y) ≈ (192.4494, 148.5133)

6.3 Interpolation Methods


6.3.1 Nearest Neighbor Interpolation
 Principle: Use value of nearest pixel

 Properties: Fast but produces jagged edges

6.3.2 Bilinear Interpolation


 Principle: Weighted average of 4 surrounding pixels

 Example Calculation:

I₂(x,y) = (1-Δx)(1-Δy)·I₀₀ + Δx(1-Δy)·I₁₀ + (1-Δx)Δy·I₀₁ + ΔxΔy·I₁₁

o For point (192.4494, 148.5133):

 x₀ = 192, x₁ = 193, y₀ = 148, y₁ = 149

 Δx = 0.4494, Δy = 0.5133

 I₀₀ = 100, I₁₀ = 102, I₀₁ = 105, I₁₁ = 107

 Result ≈ 103.4651
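
A small NumPy routine that reproduces this calculation (note that image arrays index as img[y, x]; the patch below is a synthetic array holding only the four example intensities):

```python
import numpy as np

def bilinear(img, x, y):
    """Sample img at a non-integer (x, y) by bilinear interpolation."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    i00, i10 = img[y0, x0],     img[y0, x0 + 1]
    i01, i11 = img[y0 + 1, x0], img[y0 + 1, x0 + 1]
    return ((1 - dx) * (1 - dy) * i00 + dx * (1 - dy) * i10
            + (1 - dx) * dy * i01 + dx * dy * i11)

# Worked example: I₀₀=100, I₁₀=102, I₀₁=105, I₁₁=107 around (192, 148)
patch = np.zeros((300, 300))
patch[148, 192:194] = [100, 102]
patch[149, 192:194] = [105, 107]
print(bilinear(patch, 192.4494, 148.5133))   # ≈ 103.4651
```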

6.3.3 Bicubic Interpolation


 Principle: Uses 16 surrounding pixels for smoother results

 Properties: Higher quality but more computationally intensive


7. Image Blending
7.1 Blending Purpose
 Objective: Eliminate seams and brightness differences in overlapping
regions

 Challenges: Exposure differences, vignetting, parallax effects

7.2 Blending Techniques


7.2.1 Feathering (Linear Blending)
 Principle: Weighted average based on distance from overlap edge

 Mathematical Expression: I_final = (1-α)·I₁ + α·I₂, so α = 0 gives pure I₁ and α = 1 gives pure I₂

 Example:

o Overlap from x’ = 400 to x’ = 600

o Weight α = (x’-400)/(600-400)

o For pixel at x’ = 450: α = 50/200 = 0.25

o I₁(450,y’) = 120, I₂_warp(450,y’) = 130

o I_blend = 0.75·120 + 0.25·130 = 122.5
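
A vectorized NumPy sketch of this feathering scheme over the example overlap (x’ = 400 to 600), assuming i1 and i2_warp are same-size grayscale canvases:

```python
import numpy as np

def feather(i1, i2_warp, x_start=400, x_end=600):
    """Linear feathering: alpha rises 0 -> 1 across the overlap columns."""
    out = i1.astype(np.float64).copy()
    alpha = (np.arange(x_start, x_end) - x_start) / (x_end - x_start)
    out[:, x_start:x_end] = ((1 - alpha) * i1[:, x_start:x_end]
                             + alpha * i2_warp[:, x_start:x_end])
    out[:, x_end:] = i2_warp[:, x_end:]     # right of the overlap: pure I₂
    return out
```

At x’ = 450 this yields α = 0.25 and 0.75·120 + 0.25·130 = 122.5, matching the example above.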

7.2.2 Pyramid Blending


 Principle: Multi-resolution approach for seamless integration

 Process:

1. Construct Laplacian pyramids for each image


2. Blend each level with Gaussian mask
3. Reconstruct final image
 Properties: Better handling of high-frequency details

7.2.3 Multiband Blending


 Principle: Different frequency bands are blended differently

 Properties: Superior results for complex seams

7.3 Advanced Blending Methods


7.3.1 Gradient Domain Fusion (Poisson Blending)
 Principle: Match gradient fields rather than direct pixel values

 Process: Solve Poisson equation to find seamless transition


7.3.2 Histogram Matching
 Principle: Normalize color intensity distributions between images

 Process: Transform histogram of one image to match another

7.3.3 Gain Compensation


 Principle: Global adjustment of image brightness

 Process: Optimize gain factors for each image

8. Multiple Image Stitching


8.1 Sequential Stitching
 Process:

1. Detect & match keypoints between the blended result and I₃


2. Compute homography H_{(1+2),3} via DLT+RANSAC
3. Warp I₃ into existing canvas
4. Blend using feathering or multiband pyramid
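
For comparison, OpenCV bundles this entire pipeline (detection through multiband blending) behind one call; a sketch with placeholder filenames:

```python
import cv2

imgs = [cv2.imread(p) for p in ("I1.jpg", "I2.jpg", "I3.jpg")]   # placeholder paths
stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
status, pano = stitcher.stitch(imgs)
if status == cv2.Stitcher_OK:
    cv2.imwrite("panorama.jpg", pano)
```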
8.2 Global Alignment
 Principle: Jointly optimize all homographies to minimize global error

 Methods:

o Bundle adjustment

o Global homography estimation

o Graph-based optimization

9. Final Processing and Output


9.1 Cropping
 Purpose: Remove empty (black) regions

 Process:

1. Compute convex hull of all warped pixel footprints
2. Take axis-aligned bounding box: [x_min, x_max] × [y_min, y_max]
3. Crop to that box
 Example:

o Warped extents: x ∈ [-10, 1010], y ∈ [5, 605]

o Crop to: x ∈ [0, 1000], y ∈ [5, 605]


o Result: 1000×600 final panorama
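
A sketch of the crop, assuming pano is the composite canvas (a BGR array) with black filler outside the warped footprints; it takes the bounding box of non-black pixels directly, which matches step 2 above for a filled footprint:

```python
import numpy as np

mask = pano.sum(axis=2) > 0     # True wherever any channel is non-black
ys, xs = np.nonzero(mask)
cropped = pano[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```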

9.2 Color Adjustment


 Techniques:

o White balance correction

o Contrast stretching

o Tone mapping

9.3 Post-Processing Enhancements


 Options:

o Sharpening

o Noise reduction

o Vignette correction

10. Advanced Extensions


10.1 Bundle Adjustment
 Principle: Joint optimization of camera poses and homographies

 Process: Minimize reprojection error across all images simultaneously

10.2 3D Structure from Motion


 Principle: Infer scene depth before stitching

 Advantage: Better handling of parallax effects

10.3 Real-time Stitching Applications


 Purpose: 360° video and VR content creation

 Challenges: Computational efficiency, continuous alignment

11. Variable Catalog: Key Components


Symbol    Description                       Mathematical Domain
I_i       Input image                       2D matrix
K_i       Keypoints in image i              Set of 2D coordinates
D_i       Descriptors                       Set of n-dimensional vectors
M_{ij}    Matches between images i and j    Set of coordinate pairs
H_{ij}    Homography mapping image j to i   3×3 matrix
W_i       Warped image                      2D matrix
I_final   Final panorama                    2D matrix

12. Concrete Implementation Example


12.1 Setup and Data
Correspondences between I₁ and I₂:

Index   I₁ (x,y)     I₂ (x',y')
1       (105, 212)   (110, 215)
2       (400, 118)   (405, 120)
3       (300, 300)   (303, 302)
4       (150, 400)   (152, 405)

12.2 Implementation Steps


1. Feature Detection

o Detect keypoints in each image using SIFT

o Example: K₁ = {(105, 212), (400, 118), (300, 300)}

2. Feature Matching

o Match features between images

o Example: M₁₂ = {((105,212), (110,215)), ((400,118), (405,120))}

3. Homography Computation

o Construct matrix A from correspondences

o Solve via SVD to obtain H₁₂

4. Image Warping

o Apply H₁₂⁻¹ to map coordinates


o Use bilinear interpolation for non-integer coordinates

5. Image Blending

o Apply feathering in overlap regions

o Example: For x’ = 450, α = 0.25, resulting value = 122.5

6. Final Processing

o Crop to content bounding box

o Apply color adjustments if needed
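
Pulling steps 1–6 together, a minimal pairwise stitcher under the same assumptions as the sketches above (SIFT available; the hard paste seam is left for the blending techniques of Section 7):

```python
import cv2
import numpy as np

def stitch_pair(img1, img2):
    """Sketch of steps 1-6: detect, match, estimate H, warp, paste."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # Nearest-neighbor matching with Lowe's ratio test (Section 4)
    knn = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des2, des1, k=2)
    good = [m for m, n in knn if m.distance < 0.75 * n.distance]

    # Homography mapping I₂'s points into I₁'s frame, robust via RANSAC
    pts2 = np.float32([kp2[m.queryIdx].pt for m in good])
    pts1 = np.float32([kp1[m.trainIdx].pt for m in good])
    H, _ = cv2.findHomography(pts2, pts1, cv2.RANSAC, 3.0)

    # Warp I₂ onto a canvas twice as wide, then paste I₁ (hard seam)
    h, w = img1.shape[:2]
    canvas = cv2.warpPerspective(img2, H, (2 * w, h))
    canvas[:h, :w] = img1
    return canvas
```

Chaining stitch_pair across I₁, I₂, I₃ mirrors the sequential approach of Section 8.1.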
