Module 6
What is the curse of dimensionality? Explain the PCA dimensionality reduction
technique in detail.
● The "curse of dimensionality" refers to the challenges that arise when
dealing with high-dimensional data.
● As the number of features or dimensions increases, the amount of data
required to effectively cover the feature space grows exponentially.
● This can lead to various problems such as increased computational
complexity, sparsity of data, and difficulty in visualization and
interpretation.
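A small NumPy sketch (illustrative only, on randomly generated data) shows one symptom of the curse of dimensionality: as the number of dimensions grows, distances between points become nearly indistinguishable, which weakens distance-based methods.

```python
import numpy as np

rng = np.random.default_rng(0)

# Compare how distinguishable nearest/farthest neighbours are as dimensionality grows.
for d in (2, 10, 100, 1000):
    points = rng.random((500, d))   # 500 random points in the unit hypercube
    query = rng.random(d)
    dists = np.linalg.norm(points - query, axis=1)
    # A ratio close to 1 means the "nearest" and "farthest" points are almost
    # equally far away, so the notion of a close neighbour loses meaning.
    print(f"d={d:4d}  min/max distance ratio = {dists.min() / dists.max():.3f}")
```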
Principal Component Analysis (PCA) is a dimensionality reduction technique
commonly used to address the curse of dimensionality.
1. Data Transformation:
- PCA transforms high-dimensional data into a new coordinate system, where
the axes (principal components) are orthogonal to each other and capture the
maximum variance in the data.
2. Principal Components:
- Principal components are linear combinations of the original features.
- They are ordered by the amount of variance they explain in the data.
- The first principal component captures the most variance, the second captures
the second most, and so on.
3. Dimensionality Reduction:
- PCA retains only a subset of the principal components that capture most of the
variance in the data.
- By selecting a smaller number of principal components, PCA effectively
reduces the dimensionality of the data.
4. Eigenvalue Decomposition:
- PCA uses eigenvalue decomposition to find the principal components.
- It calculates the covariance matrix of the data and then finds the eigenvectors
(principal components) corresponding to the largest eigenvalues.
5. Variance Retention:
- PCA allows users to specify the desired amount of variance to be retained in
the reduced-dimensional space.
- By selecting the appropriate number of principal components, users can
balance the trade-off between dimensionality reduction and information
preservation.
6. Applications:
- PCA is widely used in various fields, including image processing, signal
processing, finance, and genetics.
- It helps in data compression, visualization, noise reduction, and feature
extraction.
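As a rough illustration of variance retention (point 5 above), the following sketch runs scikit-learn's PCA on synthetic data; the dataset, variable names, and the 95% variance threshold are assumptions chosen only for demonstration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
# Synthetic 200 x 50 data: a few strong directions of variance plus noise.
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 50)) + 0.1 * rng.normal(size=(200, 50))

# Keep only enough principal components to retain ~95% of the variance.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print("original dimensions :", X.shape[1])
print("retained components :", pca.n_components_)
print("variance explained  :", pca.explained_variance_ratio_.sum().round(3))
```

Passing a fraction between 0 and 1 as n_components lets PCA pick the smallest number of components whose cumulative explained variance reaches that threshold, which is the trade-off described in point 5.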
Feature Selection and Feature Extraction
● Feature Selection:
1. Definition:
- Feature selection involves choosing a subset of the most relevant features from
the original set of features.
- The goal is to improve model performance, reduce overfitting, and enhance
interpretability.
2. Methods:
- Filter Methods: Evaluate the relevance of features independently of the
model.
- Wrapper Methods: Use a specific machine learning model to evaluate the
importance of features.
- Embedded Methods: Feature selection is integrated into the model training
process.
3. Techniques:
- Univariate Selection: Select features based on statistical tests such as
chi-square, ANOVA, or correlation.
- Recursive Feature Elimination (RFE): Iteratively removes the least
important features based on model performance.
- Feature Importance: Uses algorithms like decision trees or random forests to
measure feature importance.
4. Benefits:
- Reduces overfitting by removing irrelevant or redundant features.
- Improves model interpretability and reduces computational complexity.
- Can lead to faster model training and better generalization performance.
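A brief scikit-learn sketch of two of the techniques listed above: univariate (filter) selection with a chi-square test and wrapper-style Recursive Feature Elimination with a decision tree. The Iris dataset and the parameter values are assumptions used only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2, RFE
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Filter method: keep the 2 features with the highest chi-square score.
X_filter = SelectKBest(score_func=chi2, k=2).fit_transform(X, y)

# Wrapper method: recursively eliminate features using a decision tree's importances.
rfe = RFE(estimator=DecisionTreeClassifier(random_state=0), n_features_to_select=2)
X_wrapper = rfe.fit_transform(X, y)

print("original features:", X.shape[1])
print("after filter     :", X_filter.shape[1])
print("after RFE        :", X_wrapper.shape[1], "selected mask:", rfe.support_)
```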
● Feature Extraction:
1. Definition:
- Feature extraction involves transforming the original features into a new set of
features that captures the essential information.
- It aims to reduce dimensionality, remove noise, and enhance the representation
of the data.
2. Methods:
- Principal Component Analysis (PCA): Linear transformation to find
orthogonal principal components.
- Linear Discriminant Analysis (LDA): Supervised technique that maximizes
class separability.
- Non-linear Techniques: Kernel PCA, t-distributed Stochastic Neighbor
Embedding (t-SNE), autoencoders.
3. Techniques:
- PCA: Projects data onto a lower-dimensional space while maximizing
variance.
- LDA: Finds the linear combination of features that best separates different
classes.
- Non-linear Techniques: Capture complex relationships in the data that linear
methods cannot.
4. Benefits:
- Reduces dimensionality, which can lead to faster computation and improved
model performance.
- Enhances the representation of the data by capturing underlying patterns or
structures.
- Helps in visualizing high-dimensional data and understanding its underlying
characteristics.
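To contrast the extraction methods listed above, a minimal sketch applies PCA (unsupervised), LDA (supervised), and kernel PCA (non-linear) to the same data; the Iris dataset and the two-component setting are assumptions for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# PCA: unsupervised, maximizes the variance of the projected data.
X_pca = PCA(n_components=2).fit_transform(X)

# LDA: supervised, maximizes separation between the class labels.
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

# Kernel PCA: a non-linear variant using an RBF kernel.
X_kpca = KernelPCA(n_components=2, kernel="rbf").fit_transform(X)

print(X_pca.shape, X_lda.shape, X_kpca.shape)   # each is (150, 2)
```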
Feature Selection vs. Feature Extraction:
1. Feature Selection selects a subset of relevant features from the original set of features, whereas Feature Extraction extracts a new set of features that are more informative and compact.
2. Feature Selection reduces the dimensionality of the feature space and simplifies the model, whereas Feature Extraction captures the essential information from the original features and represents it in a lower-dimensional feature space.
3. Feature Selection methods can be categorized into filter, wrapper, and embedded methods, whereas Feature Extraction methods can be categorized into linear and nonlinear methods.
4. Feature Selection requires domain knowledge and feature engineering, whereas Feature Extraction can be applied to raw data without feature engineering.
5. Feature Selection can improve the model's interpretability and reduce overfitting, whereas Feature Extraction can improve model performance and handle nonlinear relationships.
6. Feature Selection may lose some information and introduce bias if the wrong features are selected, whereas Feature Extraction may introduce some noise and redundancy if the extracted features are not informative.
Principal Component Analysis (PCA)
1. Purpose:
- PCA is a technique used for dimensionality reduction.
- It transforms high-dimensional data into a lower-dimensional space while
preserving the most important information.
2. Process:
- PCA identifies the directions (principal components) in which the data varies
the most.
- It projects the data onto these principal components, effectively reducing the
dimensionality.
3. Principal Components:
- Principal components are orthogonal vectors that capture the maximum
variance in the data.
- The first principal component explains the most variance, the second explains
the second most, and so on.
4. Mathematical Calculation:
- PCA calculates the covariance matrix of the data.
- It then finds the eigenvectors (principal components) corresponding to the
largest eigenvalues of the covariance matrix.
5. Dimensionality Reduction:
- PCA retains only the top k principal components that capture most of the
variance in the data.
- This reduces the dimensionality of the data from n dimensions to k
dimensions (k < n), as shown in the sketch after this list.
6. Applications:
- PCA is widely used in data preprocessing, feature extraction, and visualization.
- It helps in reducing noise, speeding up machine learning algorithms, and
improving interpretability.
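The covariance and eigenvector steps in points 4 and 5 can be written out directly in NumPy. This is a minimal sketch on random data, not an optimized implementation (library routines typically compute the components via SVD instead); the array shapes and k = 2 are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))      # 100 samples, n = 6 features
k = 2                              # target dimensionality, k < n

# 1. Center the data (PCA assumes zero-mean features).
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix of the features (n x n).
cov = np.cov(X_centered, rowvar=False)

# 3. Eigen-decomposition; eigh is used because the covariance matrix is symmetric.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort eigenvectors by decreasing eigenvalue and keep the top k as principal components.
order = np.argsort(eigenvalues)[::-1]
components = eigenvectors[:, order[:k]]

# 5. Project the data onto the top-k principal components.
X_reduced = X_centered @ components
print(X_reduced.shape)             # (100, 2)
```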