BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI
WORK INTEGRATED LEARNING PROGRAMMES
Digital
Part A: Content Design
Course Title Computer Vision
Course No(s)
Credit Units 4
Content Authors Ms. Seetha Parameswaran
Version 1.1
Date May 21st 2024
Course Objectives
By the end of this course, students will be able to:
No Course Objective
CO1 Apply a range of computer vision techniques, from low-level image processing to
high-level deep learning methods, to solve complex visual perception problems in
real-world scenarios.
CO2 Analyze the effectiveness of various image processing and feature extraction
algorithms for different computer vision tasks, considering factors such as image
type, noise, and computational complexity.
CO3 Design and implement computer vision systems that integrate multiple techniques
across image processing, segmentation, classification, and object detection to
address challenging visual recognition problems.
CO4 Evaluate the performance of different computer vision algorithms and deep
learning models using appropriate metrics, and critically assess their strengths
and limitations for various applications.
CO5 Critically compare and contrast the effectiveness of various computer vision
techniques across different problem domains, and propose novel solutions to
overcome their limitations in challenging scenarios.
Text Book(s)
T1 Image Processing, Analysis, and Machine Vision: Milan Sonka, Vaclav Hlavac,
Roger Boyle, Fourth edition, Cengage Learning
Reference Book(s) & other resources
R1 Forsyth, D. A., & Ponce, J. (2002). Computer Vision: A Modern Approach. Second
Edition. Prentice Hall
R2 Practical Machine Learning for Computer Vision: End-to-End Machine Learning for
Images, O’Reilly, 2021
R3 Szeliski, R., 2022. Computer vision: algorithms and applications. Springer Nature.
Content Structure
1 Computer Vision (6 hrs)
1.1 What is Computer Vision?
1.2 Why is Computer Vision hard? (T1 Ch 1.2)
1.3 Applications of Computer Vision (R3 Ch 1.1)
1.4 Image representation and image analysis tasks (T1 Ch 1.3)
1.5 Image digitization - Sampling and resolution (T1 Ch 2.2)
1.6 Digital Images (T1 Ch 2.3)
1.7 Digital Image types - Binary, Gray-scale and Color (Class Notes)
1.8 Color Images (T1 Ch 2.4)
1.9 Color spaces: RGB and HSV (T1 Ch 2.4)
2 Low-level Vision (4 hrs)
2.1 Histogram and Histogram equalization (R3 Ch 3.1.4)
2.2 Gray-scale transformation (T1 Ch 5.1.2)
2.3 Image Smoothing (T1 Ch 5.3.1)
2.4 Image Sharpening
2.5 Connected components in images (R3 Ch 3.3.4)
3 Mid-level Vision (6 hrs)
3.1 Edge Detection using Gradients, Sobel, Canny (T1 Ch 5.3.2, 5.3.5)
3.2 Line detection using Hough transforms (T1 Ch 5.3.10)
3.3 Histogram of Oriented Gradients
3.4 Corner detection using Harris Corner Detector
3.5 Image region descriptor using SIFT (T1 Ch 10.2)
3.6 Semantic information using RANSAC (T1 Ch 10.3)
4 Object Segmentation (4 hrs)
4.1 Types of Segmentation: Semantic vs Instance (Class Notes)
4.2 Segmentation using Agglomerative clustering, Kmeans (R1 Ch 9.3)
4.3 Mean-shift clustering (T1 Ch 7.1)
4.4 Metrics for Object Segmentation (R1 Ch 9.5)
4.5 Popular DNN Architectures for Segmentation
5 Image Classification using Deep Learning (4 hrs)
5.1 Introduction to DNN Architectures for image classification
5.2 Metrics for Image Classification (R1 Ch 15.1)
5.2.1 Model Accuracy Metrics
5.2.1.1 Accuracy, Confusion Matrix, TPR, FPR, FNR, Top-K accuracy
5.2.1.2 Precision, Recall, F1 Score
5.2.1.3 AUC-ROC, AUC-PR
5.2.2 Model Performance Metrics
5.2.2.1 FLOPs
5.2.2.2 Memory Footprint at a specific precision
5.2.2.3 Inference Time on a specific hardware
5.2.3 Metrics for Image Classification
5.2.3.1 Cross Entropy (Log Loss), Brier Score
5.2.3.2 Macro-Precision, Macro-Recall, Macro-F1
6 Object Detection and Recognition (4 hrs)
6.1 Object detection (T1 Ch 9.2)
6.2 Popular Models: YOLO, SSD, Faster R-CNN
6.3 Metrics for Object detection
6.3.1 Average-Precision (AP)
6.3.2 Mean-Average-Precision (mAP)
6.4 Multi-label object detection and recognition (Class Notes)
6.4.1 Object Localization → Multilabel Classification
6.4.2 Difference between Multiclass and Multilabel Classification
7 Object Tracking (4 hrs) (R1 Ch 11)
7.1 Motion detection
7.2 Tracking by Detection
7.3 Tracking with the Mean Shift Algorithm
7.4 Kalman Filters
7.5 DNN architectures
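Illustration: the precision, recall and F1 metrics listed under 5.2.1.2, and their macro-averaged forms under 5.2.3.2, can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the function names per_class_prf and macro_prf and the toy fruit labels are hypothetical, and macro-F1 is taken here as the unweighted mean of the per-class F1 scores, which is one common convention.

```python
def per_class_prf(y_true, y_pred, label):
    """Precision, recall and F1 with `label` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_prf(y_true, y_pred):
    """Unweighted mean of per-class precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = [per_class_prf(y_true, y_pred, c) for c in labels]
    n = len(labels)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Toy three-class example (hypothetical labels, for illustration only).
y_true = ["apple", "apple", "banana", "banana", "cherry", "cherry"]
y_pred = ["apple", "banana", "banana", "banana", "cherry", "apple"]

macro_precision, macro_recall, macro_f1 = macro_prf(y_true, y_pred)
print(f"macro-P={macro_precision:.3f} macro-R={macro_recall:.3f} macro-F1={macro_f1:.3f}")
```

Macro averaging gives every class equal weight regardless of how many samples it has, so it is more sensitive to poor performance on rare classes than plain accuracy.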
Some of the following Optional Modules may be covered through Experiential Learning,
Webinars, Tutorials, or Assignments:
1 Face detection and Recognition
2 Optical Character Recognition
3 Medical Imaging and Morphology
4 Remote Sensing Imaging
5 Image Retrieval
6 Edge devices for computer vision
6.1 ESP32-CAM module, Raspberry Pi, Banana Pi, etc.
6.2 Intel, Nvidia (Jetson Nano), Google Coral
Detailed Plan for Lab work
Lab No.  Lab Objective                                                           Module Reference
1        Reading images                                                          1
         Displaying images
         Color space conversion
2        Histogram equalization                                                  2
         Gray-scale transformation
         Filtering applications like sharpening, blur, noise removal, smoothing
3        Edge detection using Sobel and Canny                                    3
         Line detection using Hough Transform
         HoG
         Harris Corner detection
         RANSAC for semantic information
         SIFT image descriptor
4        Image segmentation using Kmeans                                         4
         Mean-shift clustering for segmentation
5        Fruit sorting using transfer learning                                   5
         Comparison on metrics for evaluation (demo)
6        Mean shift clustering for object detection                              6
         Object detection using YOLO and Faster R-CNN
7        Mean shift algorithm for object tracking                                7
         Kalman filtering for object tracking
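Illustration for Lab 2: histogram equalization can be sketched in plain Python as below. This is a minimal sketch, assuming an 8-bit gray-scale image stored as a list of rows; the function name equalize_histogram and the 3x3 sample image are hypothetical, and lab sessions may well use an image-processing library instead.

```python
def equalize_histogram(image, levels=256):
    """Equalize a gray-scale image given as rows of ints in [0, levels - 1]."""
    pixels = [v for row in image for v in row]

    # Histogram: number of pixels at each intensity level.
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1

    # Cumulative distribution function (running sum of the histogram).
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest level present
    if total == cdf_min:
        # Constant image: there is nothing to stretch.
        return [row[:] for row in image]

    # Look-up table mapping each old level to its equalized level.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[v] for v in row] for row in image]


# Low-contrast sample image (hypothetical values between 52 and 79).
sample = [
    [52, 55, 61],
    [59, 79, 61],
    [76, 61, 55],
]
equalized = equalize_histogram(sample)
for row in equalized:
    print(row)
```

The look-up table spreads the narrow input range (52 to 79) across the full 0 to 255 range, which is the contrast-stretching effect that histogram equalization is meant to produce.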
Evaluation Scheme:
Legend: EC = Evaluation Component; AN = Afternoon Session; FN = Forenoon Session
No Name Type Duration Weight Day, Date, Session, Time
EC-1(a) Quizzes Online 10%
EC-1(b) Assignments Take Home 20%
EC-2 Mid-Semester Test Closed Book 30%
EC-3 Comprehensive Exam Open Book 40%
Note:
Syllabus for Mid-Semester Test (Closed Book): Topics in Session Nos. 1 to 8
Syllabus for Comprehensive Exam (Open Book): All topics (Session Nos. 1 to 16)
Important links and information:
Elearn portal: https://elearn.bits-pilani.ac.in.
Students are expected to visit the Elearn portal on a regular basis and stay up to date with
the latest announcements and deadlines.
Contact sessions: Students should attend the online lectures as per the schedule
provided on the Elearn portal.
Evaluation Guidelines:
1 EC-1(a) consists of two Quizzes. Students will attempt them through the course pages
on the Elearn portal. Announcements will be made on the portal in a timely
manner.
2 EC-1(b) consists of either one or two Assignments. Students will attempt them
through the course pages on the Elearn portal. Announcements will be made on the
portal in a timely manner.
3 For Closed Book tests: No books or reference material of any kind will be
permitted.
4 For Open Book exams: Use of books and any printed / written reference material
(filed or bound) is permitted. However, loose sheets of paper will not be allowed.
Use of calculators is permitted in all exams. Laptops/Mobiles of any kind are not
allowed. Exchange of any material is not allowed.
5 If a student is unable to appear for the Regular Test/Exam due to genuine
exigencies, the student should follow the procedure to apply for the Make-Up
Test/Exam which will be made available on the Elearn portal. The Make-Up
Test/Exam will be conducted only at selected exam centres on the dates to be
announced later.
It shall be the responsibility of the individual student to be regular in maintaining the self-
study schedule as given in the course hand-out, attend the online lectures, and take all the
prescribed evaluation components such as Assignment/Quiz, Mid-Semester Test and
Comprehensive Exam according to the evaluation scheme provided in the hand-out.