
Confusion Matrix in Machine Learning

The confusion matrix is a tool used to evaluate the performance of classification models by comparing predicted values with actual values. It categorizes predictions into true positives, true negatives, false positives, and false negatives, allowing for the calculation of various metrics such as accuracy, precision, and recall. Understanding the confusion matrix is essential for assessing model performance and identifying types of errors in predictions.


Confusion Matrix in Machine Learning

The confusion matrix is a table used to evaluate the performance of classification models on a given set of test data. It can be computed only when the true values of the test data are known. The matrix itself is easy to understand, but the related terminology can be confusing. Because it shows the model's errors in matrix form, it is also known as an error matrix. Some features of the confusion matrix are given below:

o For a classifier with 2 prediction classes, the matrix is a 2×2 table; for 3 classes, it is a 3×3 table, and so on.
o The matrix has two dimensions, predicted values and actual values, along with the total number of predictions.
o Predicted values are the values predicted by the model, and actual values are the true values for the given observations.
o It looks like the table below:

                     Actual: Yes       Actual: No
    Predicted: Yes   True Positive     False Positive
    Predicted: No    False Negative    True Negative

The above table has the following cases:

o True Negative: The model predicted No, and the real or actual value was also No.
o True Positive: The model predicted Yes, and the actual value was also Yes.
o False Negative: The model predicted No, but the actual value was Yes. It is also called a Type-II error.
o False Positive: The model predicted Yes, but the actual value was No. It is also called a Type-I error.
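These four cases can be tallied directly from paired lists of actual and predicted labels; a minimal sketch in Python (the label names and data are illustrative):

```python
# Tally confusion-matrix cells from actual vs. predicted labels.
# The "Yes"/"No" labels and the data are illustrative.
actual    = ["Yes", "No", "Yes", "No", "Yes", "No"]
predicted = ["Yes", "No", "No",  "No", "Yes", "Yes"]

tp = sum(1 for a, p in zip(actual, predicted) if a == "Yes" and p == "Yes")
tn = sum(1 for a, p in zip(actual, predicted) if a == "No"  and p == "No")
fp = sum(1 for a, p in zip(actual, predicted) if a == "No"  and p == "Yes")  # Type-I error
fn = sum(1 for a, p in zip(actual, predicted) if a == "Yes" and p == "No")   # Type-II error

print(tp, tn, fp, fn)  # -> 2 2 1 1
```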

Need for Confusion Matrix in Machine learning


o It evaluates the performance of classification models when they make predictions on test data, and tells how good our classification model is.
o It tells not only the errors made by the classifier but also the type of each error, i.e., whether it is a Type-I or Type-II error.
o With the help of the confusion matrix, we can calculate different parameters for the model, such as accuracy, precision, etc.

Example: We can understand the confusion matrix using an example.

Suppose we are trying to create a model that predicts whether or not a person has a particular disease. The confusion matrix for this is given as:

                     Actual: Yes   Actual: No
    Predicted: Yes   TP = 24       FP = 8
    Predicted: No    FN = 3        TN = 65

From the above example, we can conclude that:

o The table is given for a two-class classifier, which has two predictions, "Yes" and "No." Here, Yes means the patient has the disease, and No means the patient does not have the disease.
o The classifier made a total of 100 predictions. Out of those 100 predictions, 89 are correct and 11 are incorrect.
o The model predicted "Yes" 32 times and "No" 68 times, whereas the actual "Yes" occurred 27 times and the actual "No" occurred 73 times.

Calculations using Confusion Matrix:


We can perform various calculations for the model, such as the model's
accuracy, using this matrix. These calculations are given below:

o Classification Accuracy: It is one of the most important parameters for classification problems. It defines how often the model predicts the correct output, and is calculated as the ratio of the number of correct predictions to the total number of predictions made by the classifier. The formula is given below:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)

o Misclassification rate: Also termed the Error rate, it defines how often the model gives wrong predictions. It is calculated as the ratio of the number of incorrect predictions to the total number of predictions made by the classifier. The formula is given below:

    Error rate = (FP + FN) / (TP + TN + FP + FN)

o Precision: Out of all the instances the model predicted as positive, precision measures how many were actually positive. It can be calculated using the formula below:

    Precision = TP / (TP + FP)

o Recall: Out of all the actual positive instances, recall measures how many the model predicted correctly. The recall should be as high as possible.

    Recall = TP / (TP + FN)

o F-measure: If one model has low precision and high recall, and another the reverse, it is difficult to compare them. For this purpose, we can use the F-score, which evaluates recall and precision at the same time. The F-score is maximum when recall equals precision. It can be calculated using the formula below:

    F-measure = (2 × Precision × Recall) / (Precision + Recall)
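The metrics above can be sketched as plain functions of the four cell counts; the example counts in the final lines are illustrative:

```python
# Confusion-matrix metrics as functions of the four cell counts.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def error_rate(tp, tn, fp, fn):
    return (fp + fn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f_measure(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

# Illustrative counts: TP=24, TN=65, FP=8, FN=3.
print(accuracy(24, 65, 8, 3))   # -> 0.89
print(precision(24, 8))         # -> 0.75
```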

Other important terms used in Confusion Matrix:

o Null Error rate: It defines how often our model would be wrong if it always predicted the majority class. According to the accuracy paradox, "the best classifier has a higher error rate than the null error rate."
o ROC Curve: The ROC curve is a graph displaying a classifier's performance across all possible classification thresholds. It plots the true positive rate (on the Y-axis) against the false positive rate (on the X-axis).
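A ROC curve can be computed from predicted scores with scikit-learn's roc_curve function; a minimal sketch (the labels and scores are illustrative):

```python
from sklearn.metrics import roc_curve, roc_auc_score

# Illustrative true labels and classifier scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

# fpr/tpr trace the ROC curve point by point as the threshold varies.
fpr, tpr, thresholds = roc_curve(y_true, scores)
auc = roc_auc_score(y_true, scores)  # area under the ROC curve
print(auc)  # -> 0.75
```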
Confusion Matrix Metrics

Figure 3: Confusion Matrix for a classifier

Consider a confusion matrix made for a classifier that classifies people based on
whether they speak English or Spanish.

From the above diagram, we can see that:

True Positives (TP) = 86

True Negatives (TN) = 79

False Positives (FP) = 12

False Negatives (FN) = 10

 Accuracy: The accuracy is used to find the portion of correctly classified values. It tells us
how often our classifier is right. It is the sum of all true predictions divided by the total number of predictions.
Figure 4: Accuracy

In this case:

Accuracy = (86 + 79) / (86 + 79 + 12 + 10) = 0.8824 = 88.24%

 Precision: Precision is used to calculate the model's ability to classify positive values
correctly. It is the true positives divided by the total number of predicted positive values.

Figure 5: Precision

In this case,

Precision = 86 / (86 + 12) = 0.8776 = 87.76%

 Recall: It is used to calculate the model's ability to predict positive values. "How often
does the model predict the correct positive values?". It is the true positives divided by the
total number of actual positive values.

Figure 6: Recall

In this case,

Recall = 86 / (86 + 10) = 0.8958 = 89.58%

 F1-Score: It is the harmonic mean of Recall and Precision. It is useful when you need to
take both Precision and Recall into account.
Figure 7: F1-Score

In this case,

F1-Score = (2 × 0.8776 × 0.8958) / (0.8776 + 0.8958) = 0.8866 = 88.66%
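These calculations are easy to reproduce in a few lines of Python, using the cell counts from the matrix above (TP = 86, TN = 79, FP = 12, FN = 10):

```python
# Reproduce the worked-example metrics from the four cell counts.
tp, tn, fp, fn = 86, 79, 12, 10

accuracy  = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1        = 2 * precision * recall / (precision + recall)

print(round(accuracy, 4), round(precision, 4), round(recall, 4), round(f1, 4))
# -> 0.8824 0.8776 0.8958 0.8866
```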

Scaling a Confusion Matrix

To scale a confusion matrix to more than two classes, increase the number of rows and columns. All the correct predictions will lie along the diagonal; every off-diagonal value is a misclassification, counting as a False Positive for the predicted class and a False Negative for the actual class.
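For more than two classes, scikit-learn's confusion_matrix function grows the table accordingly; a 3-class sketch (the labels and data are illustrative):

```python
from sklearn.metrics import confusion_matrix

# Illustrative 3-class labels; rows are actual classes, columns are predicted.
y_true = ["cat", "dog", "cat", "bird", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "bird", "dog", "cat"]

labels = ["bird", "cat", "dog"]
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)  # correct predictions lie on the diagonal
```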

Now that we understand what a confusion matrix is and how it works, let's explore how to find the accuracy of a model with a hands-on confusion matrix demo in Python.

Confusion Matrix With Python

We'll build a logistic regression model using a heart attack dataset to predict if a
patient is at risk of a heart attack.

Depicted below is the dataset that we'll be using for this demonstration.
Figure 9: Heart Attack Dataset

Let’s import the necessary libraries to create our model.

Figure 10: Importing Confusion Matrix in python

We can import the confusion matrix function from sklearn.metrics. Let’s split our
dataset into the input features and target output dataset.
Figure 11: Splitting data into variables and target dataset
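Since the original code screenshot is not reproduced here, a hypothetical sketch of this step is shown below; the column names (age, chol, output) are assumptions for illustration, not the actual dataset schema:

```python
import pandas as pd

# Hypothetical miniature stand-in for the heart-attack dataset;
# column names are assumptions, not the real schema.
df = pd.DataFrame({
    "age":    [63, 45, 56, 70],
    "chol":   [233, 180, 294, 269],
    "output": [1, 0, 1, 0],  # 1 = at risk of heart attack, 0 = not at risk
})

X = df.drop(columns=["output"])  # input features
y = df["output"]                 # target variable
print(X.shape, y.shape)  # -> (4, 2) (4,)
```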

As we can see, our data contains a wide range of values: some are single digits, while others run into the hundreds. To make our calculations more straightforward, we will scale the data down to a small range of values using the StandardScaler.

Figure 12: Scaling down our dataset
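A sketch of the scaling step with scikit-learn's StandardScaler (the sample values are illustrative):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative feature matrix with very different column ranges.
X = np.array([[63.0, 233.0], [45.0, 180.0], [56.0, 294.0], [70.0, 269.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # each column now has mean 0 and std 1
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```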

Now, let's split our dataset into two: one to train our model and another to test our
model. To do this, we use train_test_split imported from sklearn. Using a Logistic
Regression Model, we will perform Classification on our train data and predict our
test data to check the accuracy.
Figure 13: Performing classification
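A sketch of the split-train-predict step; a synthetic dataset stands in for the heart-attack data, which is not reproduced here:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the heart-attack dataset.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hold out 25% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
pred = model.predict(X_test)  # predictions on the held-out test set
print(len(pred))  # -> 50
```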

To find the accuracy of a confusion matrix and all other metrics, we can import
accuracy_score and classification_report from the same library.

Figure 14: Accuracy of classifier

The accuracy_score function gives us the accuracy of our classifier.
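A minimal sketch of accuracy_score (the label lists are illustrative):

```python
from sklearn.metrics import accuracy_score

# Illustrative actual values and predictions.
y_test = [1, 0, 1, 1, 0, 1, 0, 0]
pred   = [1, 0, 1, 0, 0, 1, 1, 0]

acc = accuracy_score(y_test, pred)  # fraction of correct predictions
print(acc)  # -> 0.75
```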

Figure 15: Confusion Matrix for data

Using the predicted values (pred) and our actual values (y_test), we can create a
confusion matrix with the confusion_matrix function.

Then, using the ravel() method of the resulting matrix, we can extract the True
Positive, True Negative, False Positive, and False Negative values.

Figure 16: Extracting matrix value
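A sketch of this step (the label lists are illustrative); note that scikit-learn's binary confusion matrix ravels in the order TN, FP, FN, TP:

```python
from sklearn.metrics import confusion_matrix

# Illustrative actual values and predictions.
y_test = [1, 0, 1, 1, 0, 1, 0, 0]
pred   = [1, 0, 1, 0, 0, 1, 1, 0]

cm = confusion_matrix(y_test, pred)
tn, fp, fn, tp = cm.ravel()  # scikit-learn's cell order: TN, FP, FN, TP
print(tn, fp, fn, tp)  # -> 3 1 1 3
```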


Figure 17: Confusion Matrix Metrics

Finally, using the classification_report, we can find the values of various metrics of
our confusion matrix.
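A sketch of classification_report on the same illustrative labels; passing output_dict=True makes the per-class metrics easy to inspect programmatically:

```python
from sklearn.metrics import classification_report

# Illustrative actual values and predictions.
y_test = [1, 0, 1, 1, 0, 1, 0, 0]
pred   = [1, 0, 1, 0, 0, 1, 1, 0]

# output_dict=True returns the metrics as a nested dict instead of a string.
report = classification_report(y_test, pred, output_dict=True)
print(report["1"]["precision"], report["1"]["recall"])  # -> 0.75 0.75
```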
