Confusion Matrix
Confusion Matrix is a performance measurement tool for machine learning models. It is used to visualize the performance of a classification algorithm. A basic explanation of the confusion matrix is shown in Figure 12.
Fig.12 Confusion Matrix
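As a minimal sketch (assuming scikit-learn is available; the label arrays y_true and y_pred are hypothetical, not from this work), a confusion matrix can be computed like this:

    # Minimal sketch: computing a confusion matrix with scikit-learn.
    # y_true / y_pred are hypothetical label arrays, not from this work.
    from sklearn.metrics import confusion_matrix

    y_true = [1, 0, 1, 1, 0, 1]   # ground-truth labels
    y_pred = [1, 0, 0, 1, 0, 1]   # model predictions
    cm = confusion_matrix(y_true, y_pred)
    print(cm)   # rows = actual classes, columns = predicted classes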
Performance Parameters
All performance parameters, their descriptions, and their equations are shown below.
Accuracy:
Accuracy represents the number of correctly classified data instances over the total number of data instances:
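In terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), the usual form is:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$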
Precision:
Precision measures a classification model's ability to return only the most relevant data points. It is defined mathematically as shown below in equation 4.3.
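A standard formulation, using the TP/FP notation above, is:
$$\text{Precision} = \frac{TP}{TP + FP}$$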
Recall:
Recall measures an algorithm's ability to identify every relevant instance of a class within a dataset. We define recall statistically as shown below:
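A standard formulation is:
$$\text{Recall} = \frac{TP}{TP + FN}$$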
F1 Score:
As illustrated below, the F1 score is the harmonic mean of the precision and recall scores. It ranges from 0 to 100%, with a higher F1 score indicating a better classifier.
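The usual form is:
$$F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$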
Machine Learning Algorithms
AI: Artificial Intelligence is a technique that can perform its tasks without any human interaction.
Machine Learning is basically a part of AI that provides statistical tools to analyze and visualize data and to build a predictive model.
Figure 10 shows the algorithms that are used in this work.
Fig.10 Used Machine Learning Algorithms
KNN
The K-Nearest Neighbors (K-NN) algorithm is a popular Machine
Learning algorithm used mostly for solving classification problems.
Working of KNN:
The K-NN algorithm compares a new data entry to the values in a given data set (with different classes or categories). Based on its closeness or similarity to a chosen number (K) of neighbors, the algorithm assigns the new data entry to a class or category present in the training data. The algorithm proceeds in the following steps:
Step #1 - Load Data and Assign a value to K.
Step #2 - Calculate the distance between the new data entry and all other existing data entries (you'll learn
how to do this shortly). Arrange them in ascending order.
Step #3 - Find the K nearest neighbors to the new entry based on the calculated distances.
Step #4 - Assign the new data entry to the majority class in the nearest neighbors.
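A minimal sketch of these four steps in Python with NumPy (the function name, variable names, and toy data are illustrative assumptions, not taken from this work):

    # Minimal K-NN sketch following the four steps above (illustrative only).
    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, new_point, k=3):
        # Step 2: Euclidean distance from the new entry to every training point
        distances = np.linalg.norm(X_train - new_point, axis=1)
        # Step 3: indices of the K nearest neighbors (ascending distance)
        nearest = np.argsort(distances)[:k]
        # Step 4: majority class among those neighbors
        return Counter(y_train[nearest]).most_common(1)[0][0]

    # toy usage (hypothetical data)
    X_train = np.array([[1.0, 2.0], [2.0, 3.0], [8.0, 8.0], [9.0, 7.0]])
    y_train = np.array([0, 0, 1, 1])
    print(knn_predict(X_train, y_train, np.array([8.5, 7.5]), k=3))  # -> 1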
How do I choose K?
Selecting the optimal value of K depends on the characteristics of the input data. If the dataset has significant outliers or noise, a higher K can help smooth out the predictions and reduce the influence of noisy data. However, choosing a very high value can lead to underfitting, where the model becomes too simplistic.
How does KNN work?
1. Euclidean Distance
We usually use the Euclidean distance to find the nearest neighbors. For two points $(x, y)$ and $(a, b)$, the Euclidean distance is
$$d = \sqrt{(x - a)^2 + (y - b)^2}$$
2. Manhattan Distance
This is the total distance you would travel if you could only move along horizontal and vertical lines (like a
grid or city streets). It’s also called “taxicab distance” because a taxi can only drive along the grid-like streets
of a city.
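For the same two points, the Manhattan distance is $d = |x - a| + |y - b|$. A minimal sketch of both distance measures in Python (a plain illustration, not code from this work):

    # Illustrative distance functions for K-NN (Euclidean and Manhattan).
    import math

    def euclidean(p, q):
        # straight-line distance between p = (x, y) and q = (a, b)
        return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

    def manhattan(p, q):
        # grid ("taxicab") distance between the same two points
        return abs(p[0] - q[0]) + abs(p[1] - q[1])

    print(euclidean((0, 0), (3, 4)))   # 5.0
    print(manhattan((0, 0), (3, 4)))   # 7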
Decision Tree
Decision tree-based models use training data to derive rules that are used to predict an output.
A decision tree builds a classification or regression model in the form of a tree structure.
It breaks a data set down into smaller and smaller subsets while, at the same time, an associated decision tree is incrementally developed.
The final result is a tree with decision nodes and leaf nodes.
Decision nodes can have two or more branches.
A leaf node represents a classification or decision.
The topmost decision node in a tree, which corresponds to the best predictor, is called the root node.
Decision trees can handle both categorical and numerical data.
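As a minimal sketch of fitting such a tree (assuming scikit-learn; the arrays X and y and the hyperparameters are illustrative placeholders, not the setup used in this work):

    # Minimal sketch: training a classification tree with scikit-learn.
    from sklearn.tree import DecisionTreeClassifier

    X = [[0, 0], [1, 1], [1, 0], [0, 1]]   # hypothetical feature rows
    y = [0, 1, 1, 0]                        # hypothetical class labels
    clf = DecisionTreeClassifier(criterion="entropy", max_depth=3)
    clf.fit(X, y)
    print(clf.predict([[1, 1]]))            # predicted class for a new row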
Working of Decision Tree
A decision tree (DT) chooses its splits using two basic impurity measures, Entropy and the Gini Index, combined through Information Gain.
Entropy is a measure of randomness:
$$\text{Entropy} = -P_{+}\log_2 P_{+} - P_{-}\log_2 P_{-}$$
Entropy ranges from 0 to 1.
Gini Index: the Gini Index, also known as Gini Impurity, measures the likelihood that a randomly picked instance would be incorrectly classified:
$$\text{Gini Index} = 1 - \left[(P_{+})^2 + (P_{-})^2\right]$$
The Gini Index ranges from 0 to 0.5.
P+ = probability of the positive (True) class
P− = probability of the negative (False) class
Fig. 11 Entropy vs Gini Impurity
Information Gain:
Measures the reduction of entropy before and after splitting a subset S on an attribute A, where:
1. E(S): The current entropy on our subset S, before any split
2. |S|: The size or the number of instances in S
3. A: An attribute in S that has a given set of values (Let’s say it is a discrete attribute)
4. v: Stands for value and represents each value of the attribute A
5. Sv: After splitting S using A, Sv refers to each of the resulted subsets from S, that share the same value in A
6. E(Sv): The entropy of a subset Sv . This should be computed for each value of A (assuming it is a discrete
attribute)
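Combining these terms, the standard information-gain expression consistent with the definitions above is:
$$IG(S, A) = E(S) - \sum_{v \in \text{values}(A)} \frac{|S_v|}{|S|}\, E(S_v)$$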
Worked example: applying this formula to candidate attributes of a sample dataset gives information gains of approximately 0.2464 and 0.0289; the attribute with the larger gain is preferred for the split.
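A minimal sketch of such a calculation in Python (the toy labels and split are illustrative assumptions, not the dataset used in this work):

    # Illustrative entropy / information-gain calculation for a class column.
    import math
    from collections import Counter

    def entropy(labels):
        # E = -sum(p * log2(p)) over the classes present in `labels`
        total = len(labels)
        return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

    def information_gain(parent_labels, child_label_groups):
        # IG = E(parent) - weighted average entropy of the child subsets
        total = len(parent_labels)
        weighted = sum(len(g) / total * entropy(g) for g in child_label_groups)
        return entropy(parent_labels) - weighted

    parent = ["yes"] * 9 + ["no"] * 5                              # toy class column
    split = [["yes"] * 6 + ["no"] * 2, ["yes"] * 3 + ["no"] * 3]   # one candidate split
    print(round(information_gain(parent, split), 4))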
Decision Trees
Basic Decision Tree Terminologies
• Parent and Child Node: A node that gets divided into sub-nodes is known as a Parent Node, and these sub-nodes are known as Child Nodes. Since a node can be divided into multiple sub-nodes, it can act as a parent node of numerous child nodes.
• Root Node: The topmost node of a decision tree. It does not have any parent node. It represents the entire population or sample.
• Leaf / Terminal Nodes: Nodes of the tree that do not have any child node are known as Terminal/Leaf Nodes.
There are multiple tree models to choose from based on their learning technique when building a decision tree, e.g., ID3, CART (Classification and Regression Tree), C4.5, etc. Selecting which decision tree to use depends on the problem statement. For example, for classification problems we mainly use a classification tree with the Gini index to identify class labels, particularly for datasets with a relatively large number of classes.
Node splitting, or simply splitting, divides a node into multiple sub-nodes to create relatively pure nodes. This is done by finding the best split for a node and can be done in multiple ways. The ways of splitting a node can be broadly divided into two categories based on the type of target variable:
1. Continuous Target Variable: Reduction in Variance
2. Categorical Target Variable: Gini Impurity, Information Gain, and Chi-Square
Reduction in Variance in Decision Tree
Reduction in Variance is a method for splitting the node used when the
target variable is continuous, i.e., regression problems. It is called so
because it uses variance as a measure for deciding the feature on which a
node is split into child nodes.
$$\text{Var}(S) = \frac{1}{|S|}\sum_{i \in S}(x_i - \bar{x})^2, \qquad \text{Reduction in Variance} = \text{Var}(S) - \sum_{v} \frac{|S_v|}{|S|}\,\text{Var}(S_v)$$
Variance is used for calculating the homogeneity of a node. If a node is
entirely homogeneous, then the variance is zero.
Here are the steps to split a decision tree using the reduction in variance
method:
1. For each split, individually calculate the variance of each child node
2. Calculate the variance of each split as the weighted average variance
of child nodes
3. Select the split with the lowest variance.
4. Perform steps 1-3 until completely homogeneous nodes are achieved
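A minimal sketch of the weighted-variance comparison behind these steps (illustrative only; the function and variable names are assumptions):

    # Illustrative reduction-in-variance check for one candidate split (regression target).
    import statistics

    def variance(values):
        # population variance; zero for a fully homogeneous node
        return statistics.pvariance(values) if len(values) > 1 else 0.0

    def split_variance(child_groups):
        # weighted average variance of the child nodes
        total = sum(len(g) for g in child_groups)
        return sum(len(g) / total * variance(g) for g in child_groups)

    parent = [10.0, 12.0, 30.0, 31.0, 29.0, 11.0]          # toy continuous target
    candidate = [[10.0, 12.0, 11.0], [30.0, 31.0, 29.0]]   # one candidate split
    print(variance(parent) - split_variance(candidate))    # reduction in variance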
Information Gain in Decision Tree
Now, what if we have a categorical target variable? For categorical variables, a reduction in variance won't quite cut it. Well, the answer to that is Information Gain. The Information Gain method is used for splitting the nodes when the target variable is categorical. It works on the concept of entropy and is given by:
$$\text{Information Gain} = 1 - \text{Entropy}$$
Entropy is used for calculating the purity of a node. The lower the value of entropy, the higher the
purity of the node. The entropy of a homogeneous node is zero. Since we subtract entropy from 1,
the Information Gain is higher for the purer nodes with a maximum value of 1. Now, let’s take a
look at the formula for calculating the entropy:
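For a node whose classes occur with probabilities $p_i$, the standard form is:
$$\text{Entropy} = -\sum_{i} p_i \log_2 p_i$$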
Steps to split a decision tree using Information Gain:
1. For each split, individually calculate the entropy of each child node
2. Calculate the entropy of each split as the weighted average entropy of child nodes
3. Select the split with the lowest entropy or highest information gain
4. Until you achieve homogeneous nodes, repeat steps 1-3
Gini Impurity in Decision Tree
Gini Impurity is a method for splitting the nodes when the target variable is categorical. It is the most popular and easiest way to split a decision tree.
What is Gini?
Gini is the probability of correctly labeling a randomly chosen element if it is randomly labeled according to the distribution of labels in the node. The formula for Gini is:
$$\text{Gini} = \sum_{i} p_i^2$$
And Gini Impurity is:
$$\text{Gini Impurity} = 1 - \sum_{i} p_i^2$$
The lower the Gini Impurity, the higher the homogeneity of the node. The Gini Impurity of a pure node is
0.
Here are the steps to split a decision tree using Gini Impurity:
1. Similar to what we did for information gain: for each split, individually calculate the Gini Impurity of each child node
2. Calculate the Gini Impurity of each split as the weighted average
Gini Impurity of child nodes
3. Select the split with the lowest value of Gini Impurity
4. Until you achieve homogeneous nodes, repeat steps 1-3
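A minimal sketch of the Gini Impurity calculation for a single node (illustrative only):

    # Illustrative Gini Impurity for a node's class labels.
    from collections import Counter

    def gini_impurity(labels):
        # 1 - sum of squared class probabilities; 0 for a pure node
        total = len(labels)
        return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

    print(gini_impurity(["yes", "yes", "no", "no"]))   # 0.5 (maximally mixed, binary case)
    print(gini_impurity(["yes", "yes", "yes"]))        # 0.0 (pure node)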
Chi-Square in Decision Tree
Chi-square is another method of splitting nodes in a decision tree for datasets having categorical
target values. It is used to make two or more splits in a node. It works on the statistical significance
of differences between the parent node and child nodes.
The Chi-Square value is:
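A common per-class form, consistent with the description below (some treatments additionally take the square root of this quantity), is:
$$\chi^2_{\text{class}} = \frac{(\text{Actual} - \text{Expected})^2}{\text{Expected}}$$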
Here, the Expected is the expected value for a class in a child node based on the distribution of
classes in the parent node, and the Actual is the actual value for a class in a child node.
The above formula gives us the value of Chi-Square for a class. Take the sum of Chi-Square values
for all the classes in a node to calculate the Chi-Square for that node. The higher the value, the
higher will be the differences between parent and child nodes, i.e., the higher will be the
homogeneity.
Here are the steps to split a decision tree using Chi-Square:
1. For each split, individually calculate the Chi-Square value of each
child node by taking the sum of Chi-Square values for each class in a
node
2. Calculate the Chi-Square value of each split as the sum of Chi-
Square values for all the child nodes
3. Select the split with the highest Chi-Square value
4. Until you achieve homogeneous nodes, repeat steps 1-3
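A minimal sketch of the per-node Chi-Square computation described above (the class counts are illustrative assumptions):

    # Illustrative Chi-Square value for one child node, summed over its classes.
    def chi_square_node(actual_counts, expected_counts):
        # sum of (Actual - Expected)^2 / Expected over the classes of the node
        return sum((a - e) ** 2 / e for a, e in zip(actual_counts, expected_counts))

    # Toy child node: actual class counts vs. counts expected from the parent's distribution.
    actual = [8, 2]      # e.g., 8 positives, 2 negatives observed in the child
    expected = [5, 5]    # counts expected if the child mirrored the parent
    print(chi_square_node(actual, expected))   # 3.6 (1.8 + 1.8)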
Ensemble Techniques
We have two types of Ensemble Techniques: Bagging and Boosting.
Bagging: Creating different training subsets from the sample training data with replacement is called Bagging. The final output is based on majority voting. In Bagging we use one algorithm:
• Random Forest
Boosting: Combining weak learners into strong learners by creating sequential models such that the final model has the highest accuracy is called Boosting. In Boosting we use three algorithms:
• AdaBoost
• Gradient Boost
• XGBoost
Random Forest
Random Forest is basically a type of Bagging technique. Random Forest is a classifier that contains a number of decision trees built on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset. Instead of relying on one decision tree, the random forest takes the prediction from each tree and, based on the majority vote of predictions, it predicts the final output. A greater number of trees in the forest leads to higher accuracy and prevents the problem of overfitting. Figure 11 shows the working of the RF algorithm.
Notation: d = number of samples in the dataset, f = number of features in the dataset, RS = row sampling, FS = feature sampling.
Fig.11 Working of Random Forest
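As a minimal illustration of bagging with a random forest (assuming scikit-learn; the arrays X and y and the hyperparameters are placeholders, not those used in this work):

    # Minimal sketch: a random forest with row/feature sampling via scikit-learn.
    from sklearn.ensemble import RandomForestClassifier

    X = [[0, 0], [1, 1], [1, 0], [0, 1]]   # hypothetical feature rows
    y = [0, 1, 1, 0]                        # hypothetical class labels
    clf = RandomForestClassifier(
        n_estimators=100,      # number of decision trees in the forest
        max_features="sqrt",   # feature sampling (FS) per split
        bootstrap=True,        # row sampling (RS) with replacement
    )
    clf.fit(X, y)
    print(clf.predict([[1, 1]]))            # majority vote across the trees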