DATA SCIENCE & BIG DATA
ANALYTICS
FEATURE ENGINEERING
(Feature Selection & Extraction)
Dr. S. N. Ahsan
Feature Selection
Feature selection is the process of reducing the number of
input variables when developing a predictive model; in other
words, it is the process of selecting a subset of the most
relevant predictive features for use in building a machine
learning model.
Feature elimination helps a model perform better by weeding
out redundant features and features that provide little insight.
It is also economical in computing power, because there are
fewer features to train on. Results are more interpretable,
and, when the methods are used intelligently, feature selection
reduces the chance of overfitting by detecting collinear
features and can improve model accuracy.
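As a quick illustration of the collinearity point above, the sketch below (a minimal example, assuming a pandas DataFrame of numeric features; the 0.9 cutoff and the drop_collinear helper are illustrative choices, not a fixed recipe) removes one feature from every highly correlated pair:

```python
import numpy as np
import pandas as pd

def drop_collinear(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop one feature from every pair whose |correlation| exceeds threshold."""
    corr = df.corr().abs()
    # Inspect only the upper triangle so each pair is checked once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)

# Example: x2 is a scaled copy of x1, so it gets dropped
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
df = pd.DataFrame({"x1": x1, "x2": 2 * x1 + 0.1, "x3": rng.normal(size=200)})
print(drop_collinear(df).columns.tolist())  # ['x1', 'x3']
```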
Feature Extraction
Feature Extraction aims to reduce the number of
features in a dataset by creating new features from the
existing ones (and then discarding the original features).
This new, reduced set of features should be able to
summarize most of the information contained in the
original set of features. In this way, a summarized
version of the original features is created from a
combination of the original set.
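As an illustrative sketch (not from the slides), principal component analysis below extracts two new features from the four original iris measurements; each new feature is a linear combination of all the originals:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)           # 4 original features
X_scaled = StandardScaler().fit_transform(X)

# Extract 2 new features; the 4 originals are then discarded
pca = PCA(n_components=2)
X_new = pca.fit_transform(X_scaled)

print(X_new.shape)                          # (150, 2)
print(pca.explained_variance_ratio_.sum())  # ~0.96: most information kept
```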
Supervised and Unsupervised
Feature Extraction
High-Level Taxonomy for Feature Engineering
Extended Taxonomy of Supervised
Feature Selection Methods
Feature Selection Categories
Supervised & Unsupervised Feature
Selection
Supervised feature selection techniques use the target
variable; for example, methods that remove irrelevant
variables.
Unsupervised feature selection techniques ignore the
target variable; for example, methods that remove redundant
variables using correlation.
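A minimal sketch of the supervised case (illustrative, using scikit-learn's mutual information scorer; the correlation-based sketch earlier is an unsupervised counterpart, since it never looks at the target):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)   # 30 features

# Supervised: score each feature against the target y and keep the top 10
selector = SelectKBest(mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)
print(X_reduced.shape)                       # (569, 10)
```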
General Frameworks of Supervised (a)
and Unsupervised (b) Feature Selection
Feature Selection Methods
Filter Method
In the Filter method, features are selected based on
statistical measures. It is independent of the learning
algorithm and requires less computational time.
Information gain, the chi-square test, Fisher score,
correlation coefficients, and variance thresholds are some
of the statistical measures used to assess the
importance of features.
This method is well suited to preliminary screening: it
can detect constant, duplicated, and correlated features.
It usually does not give the best performance in terms of
reducing features. That being said, it should be the first
step of feature reduction, since it can deal with
multicollinearity among the features, depending on the
method used.
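A minimal sketch of two of the filter measures named above (the 0.01 variance cutoff and k = 5 are illustrative choices):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, VarianceThreshold, chi2

X, y = load_breast_cancer(return_X_y=True)

# Variance threshold: drop (near-)constant features; no target needed
X_var = VarianceThreshold(threshold=0.01).fit_transform(X)

# Chi-square test: keep the 5 features most dependent on the target
# (chi2 requires non-negative feature values, which holds for this dataset)
X_chi = SelectKBest(chi2, k=5).fit_transform(X_var, y)
print(X.shape, X_var.shape, X_chi.shape)
```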
Wrapper Method
The Wrapper methodology considers the selection of
feature sets as a search problem, where different
combinations are prepared, evaluated, and compared to
other combinations. A predictive model is used to
evaluate a combination of features and assign model
performance scores.
The performance of the Wrapper method depends on
the classifier. The best subset of features is selected
based on the results of the classifier.
Wrapper methods are computationally more expensive
than filter methods, due to the repeated learning steps
and cross-validation. However, they are usually more
accurate than filter methods. Examples include recursive
feature elimination, sequential feature selection
algorithms, and genetic algorithms.
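A minimal sketch of one wrapper method named above, recursive feature elimination; the logistic regression classifier and the choice of 8 features are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)        # scale for the linear model

# The wrapper refits the classifier repeatedly, dropping the weakest feature each round
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=8)
rfe.fit(X, y)
print(rfe.support_.sum(), "features kept")   # 8
```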
Embedded Method
The Embedded method covers ensemble learning and
hybrid learning methods for feature selection. Since the
selection is based on a collective decision, its performance
is often better than that of the other two approaches.
Random forest is one such example. It is computationally
less intensive than wrapper methods. However, this method
has the drawback of being specific to a learning model.
In embedded techniques, the feature selection algorithm
is integrated as part of the learning algorithm. The most
typical embedded technique is the decision tree
algorithm: decision trees select a feature in each
recursive step of the tree growth process and divide the
sample set into smaller subsets.
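A minimal sketch of embedded selection using the random forest example mentioned above; the feature importances are a by-product of training itself, and SelectFromModel's default cutoff (the mean importance) is used here:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

X, y = load_breast_cancer(return_X_y=True)

# Fitting the forest produces feature importances as a side effect of learning
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Keep only the features whose importance exceeds the mean importance
selector = SelectFromModel(forest, prefit=True)
X_selected = selector.transform(X)
print(X_selected.shape)
```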
Hybrid Method
How a hybrid feature selection method is built depends
on what you choose to combine. The main priority is to
select the methods you are going to use, then follow
their processes.
A common idea is to use filter-based ranking methods to
generate a feature ranking list in a first step, then feed
the top k features from this list into a wrapper method.
In this way, we shrink the feature space of the dataset
with the filter-based rankers, which improves the time
complexity of the wrapper method, as sketched below.
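A minimal sketch of this filter-then-wrapper hybrid (k = 15, the final 5 features, and the logistic regression are all illustrative choices):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)              # 30 features
X = StandardScaler().fit_transform(X)                   # scale for the linear model

# Step 1 (filter): rank by mutual information, keep the top 15
X_top = SelectKBest(mutual_info_classif, k=15).fit_transform(X, y)

# Step 2 (wrapper): run RFE only on the reduced feature space
rfe = RFE(LogisticRegression(max_iter=5000), n_features_to_select=5)
X_final = rfe.fit_transform(X_top, y)
print(X.shape[1], "->", X_top.shape[1], "->", X_final.shape[1])  # 30 -> 15 -> 5
```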
Extended Taxonomy of Unsupervised
Feature Selection Methods