06 Local Features
06 Local Features
C280,
  L t
  Lecture 6:
          6 Local
             L l Features
                  F t
       Last Time: Image Pyramids
•   Review of Fourier Transform
•   Sampling and Aliasing
•   Image Pyramids
                 id
•   Applications: Blending and noise removal
      Today:
          y Feature Detection and
               Matching
•   Local features
•   Pyramids for invariant feature detection
•   Invariant descriptors
•   M t hi
    Matching
           Image matching
by Diva Sian
        S
                       by swashford
               Harder case
by Diva Sian
        S                    by scgbt
Harder still?
no chance to match!
Basic steps:
1) Detect distinctive interest points
2) Extract invariant descriptors
C.Harris
C H i and dMM.Stephens.
               St h       "A Combined
                             C bi d C  Corner andd Ed
                                                   Edge D
                                                        Detector.“
                                                           t t “
Proceedings of the 4th Alvey Vision Conference: pages 147--151.
        E (u, v)   w( x, y )  I ( x  u , y  v)  I ( x, y ) 
                                                                      2
x, y
                           1 iin window,
                                  i d    0 outside
                                               id             G
                                                              Gaussian
                                                                   i
                                                                  Source: R. Szeliski
  Harris Detector formulation
This measure of change can be approximated by:
                                 u 
             E (u , v)  [u v] M  
                                 v 
 where M is a 22 matrix computed from image derivatives:
                              I x2         IxI y 
             M   w( x, y )                  2 
                                                      Gradient with
                              I x I y      I y    respect to x,
                 x, y                                 times gradient
                                                      with respect to y
             Sum over image region – area
             we are checking for corner
             M
 Harris Detector formulation
                             I x I y      I y    respect to x,
                x, y                                 times gradient
                                                     with respect to y
            Sum over image region – area
            we are checking for corner
            M
What does this matrix reveal?
          I x2         I I  x y
                                      1 0 
     M                                    
         I x I y      I            0 2 
                                2
                                y
           (max)-1/2
                                (min)-1/2
                                                                 1
Corner response function
 R  det( M )   trace( M ) 2  12   (1  2 ) 2
 α: constant
        t t (0.04
             (0 04 tto 0.06)
                       0 06)
                               “Edge”
                               R<0           “Corner”
                                             R>0
                                        |R| small
                                               ll
                               “Flat”                   “Edge”
                               region                   R<0
        Harris Corner Detector
• Algorithm steps:
   – Compute M matrix within all image windows to get
     their R scores
   – Find points with large corner response
       (R > threshold)
   – Take the ppoints of local maxima of R
   Harris Detector: Workflow
f              Image 1                        f                   Image 2
                               scale = 1/2
f              Image 1                       f         Image 2
                               scale = 1/2
                                                to scale factor.
Percep
Visual
                                                                                                                     48
                                                                              K. Grauman, B. Leibe
                                                Automatic Scale Selection
                                                • Function responses for increasing scale (scale signature)
     ptual
Percep
Visual           ensory Aug
           andReScognition
       Object              Tutorial Computing
                           gmented
                           T
                                                                                             56
                                                                      K. Grauman, B. Leibe
  Characteristic scale
  We define the characteristic scale as the scale
   that produces peak of Laplacian response
                         characteristic scale
T. Lindeberg (1998). "Feature detection with automatic scale selection."
International Journal of Computer Vision 30 (2): pp 77--116. Source: Lana Lazebnik
                                                 Laplacian-of-Gaussian (LoG)
                                                • Interest points:
                                                                                    
                                                  Local maxima in scale
                                                  space of Laplacian-of-
                           Tutorial Computing
                                                  Gaussian                          
                           gmented
                                                               Lxx ( )  Lyy ( ) 
                 ensory Aug
           andReScognition T
                                                                                                      List of
       Object
                                                                                                      (x, y, σ)
     ptual
Percep
Visual
                                                                              K. Grauman, B. Leibe
Scale-space blob detector: Example
Blur
Su btrac
                                                           Candidate keypoints:
                                                           list of (x
                                                                   (x,y,σ)
                                                                      y σ)
Example of keypoint detection
                         ?
             Point descriptor should be:
                1. Invariant
                2. Distinctive
           Local descriptors
• Simplest descriptor: list of intensities within
  a patch.
      t h
• What is this going to be invariant to?
Feature descriptors
Disadvantage of patches as descriptors:
   • Small shifts can affect matching score a lot
Solution: histograms
                              0                2
                                                    Source: Lana Lazebnik
    Feature descriptors: SIFT
    Scale Invariant Feature Transform
    Descriptor computation:
         • Divide patch into 4x4 sub
                                   sub-patches:
                                        patches: 16 cells
         • Compute histogram of gradient orientations (8 reference
           angles) for all pixels inside each sub-patch
         • Resulting
           R    lti ddescriptor:
                            i t 4  4x4x8
                                     4 8 = 128 di
                                               dimensions
                                                        i
                                                                                               Steve Seitz
   Working with SIFT descriptors
• One image yields:
   – n 128-dimensional descriptors: each
     one is a histogram of the gradient
     orientations within a patch
       • [n
         [ x 128 matrix]
                     i ]
   – n scale parameters specifying the size
     of each patch
       • [n
         [ x 1 vector]
                  t ]
   – n orientation parameters specifying the
     angle of the patch
       • [n
         [ x 1 vector]
                  t ]
   – n 2d points giving positions of the
     patches
       • [n x 2 matrix]
       Affine Invariant Detection
   ( proxy for
   (a      f invariance
               i  i     to
                        t perspective
                                 ti transformations)
                                      t  f    ti )
• Above we considered:
  Similarity transform (rotation + uniform scale)
• Now we go on to:
  Affine transform (rotation + non-uniform scale)
       Mikolajczyk: Harris Laplace
1.   Initialization:
     Multiscale Harris
     corner detection      
                   Harris-Laplace points
     Mikolajczyk: Harris Affine
► Based on Harris Laplace
► Using normalization / deskewing
           rotate            rescale
        Mikolajczyk: Harris Affine
1.   Detect multi-
            multi-scale Harris points
                                p
2.   Automaticallly select the scales
     Automatica
3.   Adapt affine shape based on second order moment matrix
4.   Refine point location
Mikolajczyk: affine invariant
       i t
       interest
              t points
                  i t
1.   Initialization: Multiscale Harris corner
     detection
2.   It ti algorithm
     Iterative    l ith
     1.   Normalize window (deskewing)
                                 
     2.   Select integration
                     g        scale (max.
                                    (     of LoG))
     3.   Select differentiation scale (max. min / max)
     4.   Detect spatial localization (Harris)
     5
     5.   Compute new affine transformation (  )
     6.   Go to step 2. (unless stop criterion)
Harris Affine
       Affine Invariant Detection :
                Summary
• Under affine transformation, we do not know in advance
  shapes of the corresponding regions
• Ellipse given by geometric covariance matrix of a region
  robustly approximates this region
• For corresponding regions ellipses also correspond
Other Methods:
1.   Search for extremum along rays [Tuytelaars, Van Gool]:
2.   Maximally Stable Extremal Regions [Matas et.al.]
Feature detector and descriptor summary
  • Stable (repeatable) feature points can be
    detected regardless of image changes
     – Scale: search for correct scale as maximum of appropriate function
     – Affine: approximate
                pp           regions
                               g     with ellipses
                                              p ((this operation
                                                        p        is affine
       invariant)
                       ?
Feature matching
Given a feature in I1, how to find the best match in I2?
   1. Define distance function that compares two descriptors
   2 Test all the features in I2, find the one with min distance
   2.
Feature distance
How to define the difference between two features f1, f2?
   •     Simple approach is SSD(f1, f2)
         –   sum of square differences between entries of the two descriptors
         –   can give good scores to very ambiguous (bad) matches
f1 f2
               I1                                          I2
Feature distance
How to define the difference between two features f1, f2?
   •     Better approach: ratio distance = SSD(f1, f2) / SSD(f1, f2’)
         –   f2 is best SSD match to f1 in I2
         –   f2’ is 2nd best SSD match to f1 in I2
         –   gives small values for ambiguous matches
f1 f2' f2
              I1                                        I2
Evaluating the results
How can we measure the performance of a feature matcher?
                            50
                             75
                            200
                        feature distance
True/false positives
                                 50
                               true match
                                  75
                                 200
                               false match
feature distance
0.7
       # ttrue positives
                  iti             t
                                  true
# matching features (positives) positive
                                  rate
0.7
       # ttrue positives
                  iti             t
                                  true
# matching features (positives) positive
                                  rate
                     Ivan Laptev
                INRIA Rennes,, France
                 ivan.laptev@inria.fr
     Goal:
Interpretation
  of dynamic
    scenes
D fi iti
Definitions:
Gaussian derivative of
Space-time gradient
                   Second-moment matrix
   Space-Time interest points
Properties of :
                      1D space-time
                           p         variation of      , e.g.
                                                           g moving
                                                                  g bar
                 appearance/
accelerations                   split/merge
                disappearance
Space-Time interest points
      Motion event detection
Space-Time interest points
Motion event detection: complex background
Spatio-temporal scale selection
               Stability to size
               changes, e.g.
               camera zoom
Spatio-temporal scale selection
                      Selection of
                      temporal scales
                      captures the
                      frequency of
                      events
          Sequence alignment
   Given a data sequence with the current moment t0,
    detect and classify interest points in the time window of
    l
    length
        th tw: (t0, t0-ttw)
                                           data features
                                           model features
Experiments
Experiments
      Today:
          y Feature Detection and
               Matching
•   Local features
•   Pyramids for invariant feature detection
•   Local descriptors
•   M t hi
    Matching
Lots of applications
Features are used for:
  •   Image alignment (e.g., mosaics)
  •   3D reconstruction
  •   Motion tracking
  •   Object recognition
  •   Indexing and database retrieval
  •   Robot navigation
  •   … other
Object recognition (David Lowe)
Snaptell
     http://snaptell.com/demos/DemoLarge.htm
Photo Tourism
                 Slide Credits
•   Bill Freeman
•   Kristen Grauman
•   S
    Steve  Si
           Sietz
•   Ivan Laptev
•   Tinne Tuytelaars
N t ti
Next time: TTexture
               t