A8751 – Optimization Techniques in
Machine Learning
Course Overview:
Students will be able to understand and analyze how to deal with changing data, identify and interpret potential unintended effects in their projects, and understand and define procedures to operationalize and maintain their applied machine learning models.
Edited by Mr. S. Srinivas Reddy, Asst. Professor
Module 3:
Dimensionality Reduction and Optimization
Based on Mathematics for Machine Learning by
Deisenroth et al.
Chapters Referenced:
Chapter 10 (Dimensionality Reduction with PCA) of Mathematics for Machine Learning
Module 3: Dimensionality Reduction and Optimization
Problem Setting, Maximum Variance Perspective,
Projection Perspective, Eigenvector Computation and
Low-Rank Approximations, PCA in High Dimensions, Key
Steps of PCA in Practice, Latent Variable Perspective
Motivation & Intuition
"Imagine we collect data on 5 characteristics of students — height,
weight, exam score, attendance, and class participation. This is a 5-
dimensional dataset. Visualizing and analyzing it is hard. But what if
we could summarize most of this information in just 2 numbers — and
still capture nearly all the differences between students?“
This is exactly what PCA (Principal Component Analysis) does.
Objective : Reduce dimensions but retain maximum useful
information.
What is the Problem Setting in PCA?
Full form: Principal Component Analysis.
Type: Linear dimensionality reduction technique.
Objective: Transform high-dimensional data into a lower-dimensional space while retaining as much information (variance) as possible.
Why do we need PCA?
Real-world datasets often have many features (e.g., 100s or 1000s).
Many of these features are correlated or redundant.
Working with all features leads to:
High computation cost.
Difficulty in visualization.
Overfitting due to noisy or irrelevant features.
PCA helps by:
Finding new uncorrelated variables (principal components).
Keeping only the most important components (those with
highest variance).
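As a quick illustration of keeping only the highest-variance components, here is a minimal Python sketch using scikit-learn; the synthetic student-style data and all variable names are made up for this example, not taken from the course material.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 100 students, 5 correlated features
# (e.g., height, weight, exam score, attendance, participation).
rng = np.random.default_rng(0)
factors = rng.normal(size=(100, 2))                       # two underlying traits
data = factors @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(100, 5))

# Keep only the 2 principal components with the highest variance.
pca = PCA(n_components=2)
reduced = pca.fit_transform(data)

print(reduced.shape)                    # (100, 2): 5 features summarized by 2 numbers
print(pca.explained_variance_ratio_)    # fraction of total variance each component keeps
```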
Why do we need this?
•High-dimensional data often lies on a
low-dimensional subspace.
•Many features are redundant or correlated.
•Working in lower dimensions reduces storage and computation, and improves visualization.
Goal of PCA:
To find new directions (called principal components) along which the data varies the most.
These directions are orthogonal (perpendicular) to each other.
•Why do this?
• Reduce dimensionality (2D → 1D or higher → lower).
• Remove redundancy between correlated features.
• Focus on most informative directions.
Principal Components
• The directions we find are the eigenvectors of the covariance matrix.
• The importance of each direction is given by the eigenvalues (they tell
how much variance lies along that direction).
Eigenvectors and Eigenvalues
Eigenvectors: Directions (axes) along which the data shows
maximum variance.
Eigenvalues: Amount of variance captured along each eigenvector.
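A small NumPy sketch of this relationship, using the same 2×2 covariance matrix that appears in the worked example later in this module; the code itself is ours, not from the textbook.

```python
import numpy as np

# Covariance matrix of two correlated features (same as the worked example below).
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# For a symmetric matrix, eigh returns eigenvalues in ascending order
# and the matching orthonormal eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(S)

print(eigenvalues)          # [1. 3.]  -> variance captured along each direction
print(eigenvectors[:, -1])  # eigenvector with the largest eigenvalue = PC1 direction
```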
Key Idea
PCA transforms original correlated features into new
uncorrelated features (principal components).
The first principal component = eigenvector with largest
eigenvalue (maximum variance).
Real-World Examples
•Image Compression: Reduce a 784-pixel digit image to about 50 principal components while preserving the shape of the digit (MNIST dataset).
•Face Recognition: PCA generates “eigenfaces” for efficient storage and
recognition.
•Finance: Reduce correlated stock indicators into a few principal factors.
•Weather Data: Temperature and humidity projected into one dimension
for seasonal trend analysis.
How does PCA work (Conceptual)?
1. Data as points in high-dimensional space
   Example: Each 28×28 pixel image = a point in 784-D space.
2. Variance as Information
   Directions where data varies most = most informative.
3. Principal Components
   • First principal component (PC1): Direction of maximum variance.
   • Second principal component (PC2): Next orthogonal direction of maximum variance.
   • And so on.
4. Projection
   • Project original data onto the first few components.
   • New representation = lower dimension but preserves most variance.
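The four steps above can be sketched directly in NumPy. This is a minimal from-scratch illustration, not the textbook's code; the function name and variables are our own.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X (N x D) onto the top-k principal components."""
    # 1. Data as points in D-dimensional space: center them at the mean.
    mean = X.mean(axis=0)
    Xc = X - mean

    # 2. Variance as information: compute the covariance matrix of the features.
    S = np.cov(Xc, rowvar=False)

    # 3. Principal components: eigenvectors of S, ordered by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(S)        # ascending order
    order = np.argsort(eigvals)[::-1]
    B = eigvecs[:, order[:k]]                   # D x k basis of the subspace

    # 4. Projection: coordinates of each point in the k-dimensional subspace.
    return Xc @ B, B, mean

# Example: reduce 5-D synthetic data to 2-D.
X = np.random.default_rng(1).normal(size=(200, 5))
Z, B, mean = pca_project(X, k=2)
print(Z.shape)   # (200, 2)
```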
Worked Example: PCA with a Covariance Matrix
Problem: Two features are measured; each has variance 2 and their covariance is 1. Find the principal components and the variance explained by each.
Solution:
Step 1: Interpret the Covariance Matrix
Diagonal elements (2, 2): Variance of each feature = 2 → both features vary equally.
Off-diagonal element (1): Covariance = 1 → positive correlation: when one feature increases, the other tends to increase as well.
Step 2: Find Principal Components (Systematic Approach)
Step 2.1: Write the Covariance Matrix
Given covariance matrix: Σ = [[2, 1], [1, 2]] (variances on the diagonal, covariance off the diagonal).
Step 2.2: Compute Eigenvalues
Step 2.3: Compute Eigenvectors
Step 2.4: Order Eigenvalues and Select Principal Components
Step 2.5: Variance Explained
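A reconstruction of Steps 2.2–2.5 for the covariance matrix above (our own working, consistent with the stated values):

```latex
% Step 2.2: eigenvalues of \Sigma = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}
\det(\Sigma - \lambda I) = (2-\lambda)^2 - 1 = 0
\;\Rightarrow\; \lambda_1 = 3, \quad \lambda_2 = 1

% Step 2.3: eigenvectors
(\Sigma - 3I)v = 0 \;\Rightarrow\; v_1 = \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
(\Sigma - I)v = 0 \;\Rightarrow\; v_2 = \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}

% Step 2.4: order eigenvalues: \lambda_1 = 3 > \lambda_2 = 1, so PC1 = v_1 and PC2 = v_2.

% Step 2.5: variance explained by PC1
\frac{\lambda_1}{\lambda_1 + \lambda_2} = \frac{3}{3 + 1} = 75\%
```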
Maximum Variance Perspective of PCA
What is the idea?
We have high-dimensional data (e.g., 2D or 3D) and want to reduce it to
fewer dimensions (e.g., 1D) while keeping maximum information.
Information = spread of data = variance.
So, choose a line (direction) where the variance of projected data is
maximum.
1. What is spread of data?
Spread = how far data points are from the center (mean).
If data points are close to the mean → small spread.
If data points are far from the mean → large spread.
Mathematically, spread is measured by distance from the mean.
2. Measuring distance: deviation from the mean
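In symbols (a short summary using standard definitions, with x_n a data point and \bar{x} the mean):

```latex
% Deviation of a data point x_n from the mean:
d_n = x_n - \bar{x}, \qquad \bar{x} = \frac{1}{N}\sum_{n=1}^{N} x_n

% Spread (variance) = average squared deviation from the mean:
\operatorname{Var} = \frac{1}{N}\sum_{n=1}^{N} \lVert x_n - \bar{x} \rVert^{2}
```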
Maximum Variance Perspective of PCA: step-by-step
• PCA looks for a line (direction) along which the data points are spread out the most.
• This line is called the first principal component (PC1).
• By projecting data onto this line, we keep most of the important information but in fewer dimensions.
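The derivation behind this, in outline (assuming the data have been centered; this follows the standard maximum-variance argument in Deisenroth et al., Chapter 10):

```latex
% Variance of the data projected onto a unit vector b, where S is the data covariance matrix:
V(b) = \frac{1}{N}\sum_{n=1}^{N} (b^{\top} x_n)^{2} = b^{\top} S b, \qquad \lVert b \rVert = 1

% Maximize with a Lagrange multiplier \lambda for the unit-norm constraint:
\mathcal{L}(b, \lambda) = b^{\top} S b + \lambda (1 - b^{\top} b)
\;\Rightarrow\; S b = \lambda b

% So the optimal direction b is an eigenvector of S, the attained variance is
% b^{\top} S b = \lambda, and the best choice is the eigenvector with the
% largest eigenvalue: the first principal component PC1.
```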
Projection Perspective:
Instead of thinking about “maximum spread,” projection perspective
looks at it like this:
“If we drop the data onto a line, which line gives the least
reconstruction error?”
In other words, we want the line where points stay closest to their
original positions after projection and reconstruction.
There are two complementary ways to understand PCA mathematically:
1. Maximum Variance Perspective (already studied):
   • Finds directions (principal components) with maximum variance.
   • Equivalent to finding eigenvectors of the covariance matrix.
2. Projection Perspective (to study now):
   • Minimizes reconstruction error when projecting data onto a subspace.
   • Equivalent mathematically to the variance perspective but focuses on error minimization.
Motivation
•In real-world applications, we often project high-dimensional data onto
fewer dimensions (like a plane or line).
•The question: How do we choose this projection to minimize the
information lost?
Maximum Variance Perspective says: “Pick the direction with maximum
spread.”
Projection Perspective says: “Pick the subspace that gives the smallest
reconstruction error after projection.”
Both lead to the same principal components but are derived differently:
•Variance View: Maximize variance of projected data.
•Projection View: Minimize error of reconstructing original data from
projection.
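Why the two views agree, in one line (assuming centered data and a projection onto the top-k principal directions; the λ_i are the eigenvalues of the covariance matrix S):

```latex
\underbrace{\operatorname{tr}(S) = \sum_{i=1}^{D} \lambda_i}_{\text{total variance}}
= \underbrace{\sum_{i=1}^{k} \lambda_i}_{\text{variance kept by the projection}}
+ \underbrace{\sum_{i=k+1}^{D} \lambda_i}_{\text{average squared reconstruction error}}

% The total is fixed, so maximizing the kept variance (variance view) is the
% same as minimizing the reconstruction error (projection view): both yield
% the same principal components.
```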
How It Works
1. Choose a line (direction) → candidate principal component.
2. Project data points onto this line (like casting shadows).
3. Reconstruct back (lift shadows back to 2D).
4. Measure error: distance between each original point and its reconstructed point.
5. Find the direction that gives the smallest total error.
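A small NumPy sketch of this search over candidate directions; the toy 2-D data and the function name are our own, chosen only to illustrate the procedure (the minimum-error direction it finds is PC1):

```python
import numpy as np

def reconstruction_error(X, b):
    """Average squared error after projecting centered 2-D data X onto the
    unit direction b and reconstructing it back."""
    b = b / np.linalg.norm(b)
    Z = X @ b                   # 1-D coordinates along b (the "shadows")
    X_rec = np.outer(Z, b)      # lift the shadows back to 2-D
    return np.mean(np.sum((X - X_rec) ** 2, axis=1))

# Correlated 2-D toy data, centered at the origin.
rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[2, 1], [1, 2]], size=500)
X -= X.mean(axis=0)

# Try several candidate directions; the smallest error marks PC1.
for angle in np.linspace(0, np.pi, 8, endpoint=False):
    b = np.array([np.cos(angle), np.sin(angle)])
    print(f"angle {np.degrees(angle):6.1f} deg  error {reconstruction_error(X, b):.3f}")
```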
Interpretation:
•When we reduce dimensions (2D → 1D), we lose some information.
•PCA keeps the direction of maximum variance (PC1) and ignores the second direction (PC2).
•The error represents information lost in the ignored direction.
•Error = 0.0665 (very small).
•Meaning: Projection onto PC1 retains almost all the information (≈95.9% of the variance is kept).
•The lost variance (≈4%) is small, so 1D is a good approximation.
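These percentages come from the eigenvalues of this example's covariance matrix (the exact eigenvalues are not repeated here); the general relationship is:

```latex
% Fraction of variance kept by PC1 and fraction lost with PC2:
\text{kept} = \frac{\lambda_1}{\lambda_1 + \lambda_2} \approx 95.9\%,
\qquad
\text{lost} = \frac{\lambda_2}{\lambda_1 + \lambda_2} \approx 4\%

% If the reported error 0.0665 is the average squared reconstruction error,
% it equals the discarded eigenvalue, i.e. \lambda_2 \approx 0.0665.
```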