Computer Vision
├── 1. Fundamentals
│ ├── What is CV? | Why is CV hard?
│ ├── Applications (OCR, Biometric, AR/VR, etc.)
│ ├── Image Digitization
│ │ ├── Sampling, Quantization
│ │ ├── Resolution: Spatial & Intensity
│ └── Color Spaces (RGB, HSV)
├── 2. Low-Level Vision
│ ├── Histogram & Histogram Equalization (*Numerical*)
│ ├── Intensity Transformations
│ │ ├── log, power law - gamma correction
│ │ ├── Contrast stretching, thresholding
│ │ ├── Intensity level slicing
│ └── Spatial Filtering
│ ├── Convolution, Correlation (*Numerical*)
│ ├── Box Filter, Fuzzy Filter, Gaussian Filter
│ ├── Smoothing, Sharpening, Noise Removal
│ ├── Convolution Kernels (Mean, Gaussian, Laplacian)
├── 3. Mid-Level Vision
│ ├── Edge Detection
│ │ ├── Gradient: Prewitt, Sobel (*Numerical*)
│ │ ├── Canny Edge Detection (*Numerical*)
│ ├── Line Detection
│ │ └── Hough Transform (*Numerical*)
│ ├── Local Features
│ │ ├── Harris Corner Detector (*Numerical*)
│ │ └── SIFT, HoG
│ ├── Model Fitting
│ │ └── RANSAC
Low-Level Vision
1. Sampling
2. Quantization
3. Resolution (Spatial & Intensity)
Result (after histogram equalization):
• The pixel values are now spread more evenly across the full range (1 to 4) instead of clumped together.
• Contrast is enhanced!
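As a minimal sketch of the idea (assuming NumPy and a hypothetical 4×4 image with intensity levels 1–4, since the worked image itself is not reproduced here), the cdf-based equalization mapping can be computed like this:

```python
import numpy as np

def equalize(img, levels=4):
    """Histogram equalization for an integer image with values in 1..levels.

    Uses the standard cdf-based mapping so the output spans the full range
    (assumes the image is not constant)."""
    hist = np.bincount(img.ravel(), minlength=levels + 1)[1:]  # counts for levels 1..levels
    cdf = np.cumsum(hist) / img.size                           # cumulative distribution
    cdf_min = cdf[hist.nonzero()[0][0]]                        # cdf at first occupied level
    mapping = np.round((cdf - cdf_min) / (1 - cdf_min) * (levels - 1)).astype(int) + 1
    return mapping[img - 1]                                    # apply mapping per pixel

img = np.array([[1, 1, 2, 2],
                [1, 2, 2, 3],
                [1, 2, 3, 3],
                [2, 3, 3, 4]])
out = equalize(img)  # values now spread over the full 1..4 range
```

Here the lowest occupied level maps to 1 and the highest to `levels`, which is exactly the "spread across the full range" effect described above.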
Intensity Transformations — essential low-level operations used to enhance or modify image brightness and contrast.
Spatial Filtering — a foundational concept in low-level vision and image enhancement. This is where we manipulate pixels based on their neighborhoods using kernels (also called masks or filters).
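A minimal NumPy sketch of correlation versus convolution (convolution simply flips the kernel in both axes), assuming small float images and "valid" output size:

```python
import numpy as np

def correlate2d(img, kernel):
    """Valid-mode 2-D cross-correlation: slide the kernel without flipping."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def convolve2d(img, kernel):
    """Convolution = correlation with the kernel flipped in both axes."""
    return correlate2d(img, kernel[::-1, ::-1])

img = np.arange(25, dtype=float).reshape(5, 5)
box = np.ones((3, 3)) / 9.0        # mean (box) kernel
smoothed = correlate2d(img, box)   # symmetric kernel: convolution == correlation
```

For symmetric kernels like the box or Gaussian the two operations agree; for directional kernels like Sobel the flip matters, which is why the distinction is a classic numerical exam point.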
Spatial Filters: Kernel, Method & Use Cases, Pros & Cons

Box (Mean) Filter
• Kernel (3×3): (1/9)·[[1, 1, 1], [1, 1, 1], [1, 1, 1]]
• Method & use cases: Averaging filter; smooths the image by replacing each pixel with the mean of its neighborhood. Used for basic blur or mild noise reduction.
• Pros & cons: Simple and fast; reduces small noise. Blurs edges and fine detail.

Gaussian Filter
• Kernel (3×3): (1/16)·[[1, 2, 1], [2, 4, 2], [1, 2, 1]] (common approximation)
• Method & use cases: Weighted average based on distance from the center. Used for edge-preserving smoothing and pre-processing before edge detection.
• Pros & cons: Smooths without heavy edge loss. Slightly more compute; still blurs fine structures.

Median Filter
• Kernel (3×3): No fixed kernel — the median of the 3×3 neighborhood (a nonlinear filter).
• Method & use cases: Replaces the center pixel with the median of the surrounding values. Excellent for removing salt & pepper noise.
• Pros & cons: Preserves edges well; great for impulsive noise. Slower; less effective for Gaussian noise.

Laplacian Filter
• Kernel (3×3): [[0, 1, 0], [1, -4, 1], [0, 1, 0]] (4-neighbor form)
• Method & use cases: Computes the 2nd derivative (center minus neighbors). Used for edge enhancement and sharpening.
• Pros & cons: Highlights rapid intensity changes. Sensitive to noise; amplifies grain.

Sobel Filter (Horizontal)
• Kernel (3×3): [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
• Method & use cases: First-derivative (gradient) operator. Detects edges in specific directions (H/V). Common in edge detection and feature extraction.
• Pros & cons: Captures gradient direction; better than Prewitt. Still sensitive to noise.

Prewitt Filter (Horizontal)
• Kernel (3×3): [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
• Method & use cases: Similar to Sobel but with uniform weights. Detects vertical and horizontal edges; often used in educational/demo settings.
• Pros & cons: Simpler than Sobel. Less accurate; weaker edge detection.

High-Pass Filter
• Kernel (3×3): e.g. [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]
• Method & use cases: Emphasizes high-frequency details like edges and textures. Used for image sharpening.
• Pros & cons: Sharpens and enhances details. Very sensitive to noise.
The Canny Edge Detector is one of the most important and widely used edge detection algorithms in computer vision, combining multiple steps to produce accurate, thin, and connected edges. Introduced by John Canny (1986), the algorithm is designed to detect only real and significant edges, with minimal noise and thin lines.
Let's now walk through a step-by-step numerical example of the Canny Edge Detection algorithm on a small 5×5 grayscale image with simplified values, showing what happens at each stage.
Goal:
Apply Canny edge detection manually in stages:
We'll cover:
1. Gaussian Smoothing
2. Gradient Computation (Sobel)
3. Gradient Magnitude and Direction
4. Non-Maximum Suppression
5. Double Thresholding
6. Edge Tracking by Hysteresis
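The early stages above can be sketched in NumPy (Sobel gradients, magnitude/direction, and double thresholding on a hypothetical 5×5 image with a vertical step edge; non-maximum suppression and hysteresis are omitted for brevity):

```python
import numpy as np

def sobel_gradients(img):
    """Sobel x/y gradients over the valid interior region only."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    H, W = img.shape
    gx = np.zeros((H - 2, W - 2))
    gy = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(patch * kx)
            gy[i, j] = np.sum(patch * ky)
    return gx, gy

# Hypothetical 5x5 image: dark left half, bright right half (vertical edge)
img = np.array([[10, 10, 10, 200, 200]] * 5, dtype=float)

gx, gy = sobel_gradients(img)
mag = np.hypot(gx, gy)                  # gradient magnitude
theta = np.degrees(np.arctan2(gy, gx))  # gradient direction

# Double thresholding: classify pixels as strong, weak, or suppressed
high, low = 0.5 * mag.max(), 0.2 * mag.max()
strong = mag >= high
weak = (mag >= low) & ~strong
```

For this step edge, gy is zero everywhere and the gradient direction is 0° (pointing along +x), so the strong responses line up exactly along the vertical edge, which is what non-maximum suppression would then thin to a single-pixel line.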
We can also run Canny edge detection without the Gaussian smoothing step on the 5×5 image, keeping horizontal as the x-axis and vertical as the y-axis, as is conventional in computer vision.
Goal:
Detect corner points in an image — locations where intensity changes in both x and y directions.
These are highly distinctive and repeatable features, useful for:
• Image matching
• Tracking
• Object recognition
• SLAM (in robotics)
The Harris Corner Detector is a keypoint detection algorithm used to find interest points in an image — specifically, corners. A corner is a point where the intensity changes significantly in multiple directions (e.g., an L-shape or a checkerboard corner). These points are highly distinctive, which makes them ideal for the applications listed above.
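A didactic NumPy sketch of the Harris response (central differences and an unweighted 3×3 window instead of a Gaussian window, a simplification of the full detector):

```python
import numpy as np

def harris_response(img, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2 per pixel.

    The structure tensor M sums Ix^2, Iy^2, Ix*Iy over a 3x3 window;
    gradients are simple central differences (didactic simplification)."""
    Ix = np.zeros_like(img, dtype=float)
    Iy = np.zeros_like(img, dtype=float)
    Ix[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0  # central difference in x
    Iy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0  # central difference in y
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    H, W = img.shape
    R = np.zeros((H, W))
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            sxx = Ixx[i - 1:i + 2, j - 1:j + 2].sum()
            syy = Iyy[i - 1:i + 2, j - 1:j + 2].sum()
            sxy = Ixy[i - 1:i + 2, j - 1:j + 2].sum()
            R[i, j] = (sxx * syy - sxy ** 2) - k * (sxx + syy) ** 2
    return R

# White square on black: corners respond positively, edges negatively, flat = 0
img = np.zeros((10, 10), dtype=float)
img[2:8, 2:8] = 1.0
R = harris_response(img)
```

This reproduces the classic signature: R > 0 at corners (gradient energy in both directions), R < 0 along straight edges (energy in one direction only), and R = 0 in flat regions.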
Quick Reference: Harris vs SIFT vs HoG vs RANSAC

Harris Corner Detector
• Type: Keypoint detector
• Input: Grayscale image
• Output: Corner points
• Invariant to: Rotation (approx.); not scale-invariant
• Descriptor dim.: N/A
• Uses gradients: Yes (image gradients)
• Main use: Corner detection
• Strengths: Simple, fast
• Limitations: Not scale/rotation invariant
• Key applications: SLAM, tracking

SIFT
• Type: Keypoint detector + descriptor
• Input: Grayscale image
• Output: Keypoints + 128-D descriptors
• Invariant to: Scale, rotation, illumination
• Descriptor dim.: 128
• Uses gradients: Yes (for orientation + descriptors)
• Main use: Feature matching, tracking
• Strengths: Robust, distinctive features
• Limitations: Slower; high-dimensional descriptors
• Key applications: Image stitching, matching

HoG
• Type: Dense descriptor (no keypoint detection)
• Input: Grayscale image (fixed-size window)
• Output: Vector describing gradients in the window
• Invariant to: Lighting (partial); not scale- or rotation-invariant (unless manually rotated)
• Descriptor dim.: ~3780 (for a 64×128 window)
• Uses gradients: Yes (orientation histograms)
• Main use: Object detection (e.g., pedestrians)
• Strengths: Captures local structure and edge patterns
• Limitations: Requires fixed cell/block sizes
• Key applications: Human/vehicle detection

RANSAC
• Type: Robust model-fitting algorithm
• Input: Set of data points (can be 2D or 3D)
• Output: Best-fit model (e.g., line, homography)
• Invariant to: Outliers (statistical robustness; rotation invariance depends on the model)
• Descriptor dim.: N/A (returns a model, not a vector)
• Uses gradients: Depends (uses distances/errors)
• Main use: Fitting lines, homographies, 3D planes
• Strengths: Robust to a large % of outliers
• Limitations: Randomized; may fail with poor thresholding
• Key applications: Line fitting, homography estimation
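A minimal RANSAC sketch for robust line fitting (NumPy assumed; the threshold and iteration count are illustrative choices, not prescribed values):

```python
import numpy as np

def ransac_line(points, n_iters=200, threshold=0.5, rng=None):
    """Fit y = a*x + b robustly: sample 2 points, count inliers, keep best."""
    rng = np.random.default_rng(0) if rng is None else rng
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue  # skip a degenerate (vertical) sample
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = int(np.sum(residuals < threshold))
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# 20 points on y = 2x + 1, then corrupt 3 of them with gross outliers
x = np.arange(20, dtype=float)
pts = np.column_stack([x, 2 * x + 1])
pts[3, 1] += 40.0
pts[11, 1] -= 35.0
pts[17, 1] += 25.0

(a, b), n_in = ransac_line(pts)  # recovers a=2, b=1 with the 17 clean inliers
```

A least-squares fit on the same data would be dragged toward the outliers; RANSAC ignores them entirely because only the sampled pair defines each candidate model, illustrating the "robust to a large % of outliers" entry above.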