
Module 3

Supervised Machine
Learning algorithms
Naïve Bayes Classifier Algorithm
• The Naïve Bayes algorithm is a supervised learning algorithm based on
Bayes' theorem and used for solving classification problems.
• It is mainly used in text classification, which involves high-dimensional
training datasets.
• The Naïve Bayes classifier is one of the simplest and most effective
classification algorithms; it helps build fast machine learning models that
can make quick predictions.
• It is a probabilistic classifier, which means it predicts on the basis of the
probability of an object.
• Some popular applications of the Naïve Bayes algorithm are spam filtering,
sentiment analysis, and classifying articles.
Why is it called Naïve Bayes?
• The name Naïve Bayes combines two words, Naïve and Bayes, which can
be described as:
• Naïve: It is called naïve because it assumes that the occurrence of a
certain feature is independent of the occurrence of the other features.
For example, if a fruit is identified on the basis of colour, shape, and
taste, then a red, spherical, and sweet fruit is recognized as an apple.
Hence each feature individually contributes to identifying it as an apple,
without depending on the others.
• Bayes: It is called Bayes because it depends on the principle of
Bayes' theorem.
Bayes' Theorem:
• Bayes' theorem, also known as Bayes' rule or Bayes' law, is used to
determine the probability of a hypothesis with prior knowledge. It depends
on conditional probability.
• The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) · P(A) / P(B)

• P(A|B) is the Posterior probability: the probability of hypothesis A given the
observed event B.
• P(B|A) is the Likelihood probability: the probability of the evidence B given
that hypothesis A is true.
• P(A) is the Prior probability: the probability of the hypothesis before observing the evidence.
• P(B) is the Marginal probability: the probability of the evidence.
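The formula above can be worked through numerically. The sketch below plugs illustrative numbers into Bayes' theorem for a spam-filtering scenario; all the probabilities are assumed for illustration and are not from the slides.

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Assumed illustrative numbers: 20% of mail is spam (prior), the word
# "offer" appears in 60% of spam and in 5% of non-spam mail.
p_spam = 0.20                      # prior P(A)
p_offer_given_spam = 0.60          # likelihood P(B|A)
p_offer_given_ham = 0.05

# marginal P(B) via the law of total probability
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * (1 - p_spam)

# posterior P(A|B): probability the mail is spam given it contains "offer"
p_spam_given_offer = p_offer_given_spam * p_spam / p_offer
print(round(p_spam_given_offer, 3))  # 0.75
```

Note how the marginal P(B) is itself assembled from the likelihoods and the prior, which is why slides often omit it from the term list.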
Algorithm of Naïve Bayes'
Classifier:
• Compute the prior probability for each target class.
• Compute the frequency matrix and the likelihood probability for each
feature.
• Use Bayes' theorem to calculate the posterior probability of each
hypothesis.
• Use the maximum a posteriori (MAP) rule to classify the test object into
the hypothesis with the highest posterior probability.
• Advantages of Naïve Bayes Classifier:
• Naïve Bayes is one of the fastest and easiest ML algorithms for
predicting the class of a dataset.
• It can be used for binary as well as multi-class classification.
• It performs well in multi-class prediction compared to other
algorithms.
• It is a popular choice for text classification problems.
• Disadvantages of Naïve Bayes Classifier:
• Naïve Bayes assumes that all features are independent or unrelated,
so it cannot learn relationships between features.
Applications of Naïve Bayes
Classifier
• It is used for Credit Scoring.
• It is used in medical data classification.
• It can be used for real-time predictions because the Naïve Bayes
classifier is an eager learner.
• It is used in text classification, such as spam filtering and sentiment
analysis.
Types of Naïve Bayes Model
• Gaussian: The Gaussian model assumes that features follow a normal distribution.
This means that if predictors take continuous values instead of discrete ones, the
model assumes these values are sampled from a Gaussian distribution.
• Multinomial: The Multinomial Naïve Bayes classifier is used when the data is
multinomially distributed. It is primarily used for document classification problems,
i.e. predicting which category a particular document belongs to, such as sports,
politics, or education.
The classifier uses word frequencies as the predictors.
• Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but
the predictor variables are independent Boolean variables, such as whether a
particular word is present in a document or not. This model is also popular for
document classification tasks.
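The Gaussian model above can be sketched in a few lines: estimate a per-class prior plus a per-feature mean and variance, then score a new point with the product of prior and normal densities. This is a minimal from-scratch illustration, not a production implementation; the toy two-class dataset is assumed.

```python
import math

def gaussian_pdf(x, mean, var):
    """Normal density used as the per-feature likelihood."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit(X, y):
    """Estimate per-class priors and per-feature means/variances."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        means = [sum(col) / len(rows) for col in zip(*rows)]
        vars_ = [sum((v - m) ** 2 for v in col) / len(rows)
                 for col, m in zip(zip(*rows), means)]
        model[c] = (len(rows) / len(y), means, vars_)
    return model

def predict(model, x):
    """Pick the class with the highest posterior score (prior * likelihoods)."""
    best, best_p = None, -1.0
    for c, (prior, means, vars_) in model.items():
        p = prior
        for xi, m, v in zip(x, means, vars_):
            p *= gaussian_pdf(xi, m, v)   # naive independence assumption
        if p > best_p:
            best, best_p = c, p
    return best

# assumed toy data: two well-separated classes with two continuous features
X = [[1.0, 2.0], [1.2, 1.8], [8.0, 9.0], [8.5, 8.7]]
y = [0, 0, 1, 1]
model = fit(X, y)
print(predict(model, [1.1, 2.1]))  # 0
print(predict(model, [8.2, 9.1]))  # 1
```

In practice a log-probability formulation is used to avoid underflow when many features are multiplied together.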
Example
• For the training dataset shown on the slide, calculate the class of:
• X = (age = young, income = medium, student = yes, credit_rating = fair)
KNN(K-Nearest Neighbor)
• The K-NN algorithm assumes similarity between the new case/data and the
available cases, and puts the new case into the category most similar to
the available categories.
• The K-NN algorithm stores all the available data and classifies a new data
point based on similarity. This means that when new data appears, it can
be easily classified into a well-suited category using the K-NN algorithm.
• It is also called a lazy learner algorithm because it does not learn from the
training set immediately; instead, it stores the dataset and performs an
action on it at the time of classification.
• At the training phase, the KNN algorithm just stores the dataset; when it
gets new data, it classifies that data into the category most similar to the
new data.
KNN(K-Nearest Neighbor)
• The working of K-NN can be explained on the basis of the below
algorithm:
• Step-1: Select the number K of neighbors.
• Step-2: Calculate the Euclidean distance between the new data point and
each training data point.
• Step-3: Take the K nearest neighbors as per the calculated Euclidean
distances.
• Step-4: Among these K neighbors, count the number of data points
in each category.
• Step-5: Assign the new data point to the category for which the
number of neighbors is maximum.
• Step-6: Our model is ready.
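The steps above map almost line-for-line onto code. This is a minimal sketch; the toy two-class training data is assumed for illustration.

```python
import math
from collections import Counter

def knn_classify(train, new_point, k=3):
    # Step 2: Euclidean distance from the new point to every training point
    dists = sorted(
        (math.dist(x, new_point), label) for x, label in train
    )
    # Step 3: take the K nearest neighbours
    nearest = [label for _, label in dists[:k]]
    # Steps 4-5: count categories among the K neighbours and majority-vote
    return Counter(nearest).most_common(1)[0][0]

# assumed toy data: class "A" clustered near (1,1), class "B" near (6,6)
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B")]
print(knn_classify(train, (2, 2), k=3))  # A
print(knn_classify(train, (6, 5), k=3))  # B
```

Sorting every training point is exactly why K-NN's prediction cost grows with the size of the training set, as noted in the disadvantages below.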
KNN(K-Nearest Neighbor)
• Advantages of KNN Algorithm:
• It is simple to implement.
• It is robust to noisy training data.
• It can be more effective if the training data is large.
• Disadvantages of KNN Algorithm:
• The value of K always needs to be determined, which may be
complex at times.
• The computation cost is high because of calculating the
distance between the new data point and all the training
samples.
KNN(K-Nearest Neighbor)
• Example
Decision tree
Introduction
• Decision Tree is a Supervised learning technique that can be used for
both classification and Regression problems, but mostly it is preferred for
solving Classification problems. It is a tree-structured classifier,
where internal nodes represent the features of a dataset, branches
represent the decision rules and each leaf node represents the outcome.
• In a Decision tree, there are two types of nodes: the Decision
Node and the Leaf Node. Decision nodes are used to make decisions and
have multiple branches, whereas leaf nodes are the outputs of those
decisions and do not contain any further branches.
• The decisions or tests are performed on the basis of the features of the
given dataset.
• It is a graphical representation for getting all the possible solutions
to a problem/decision based on given conditions.
• It is called a decision tree because, similar to a tree, it starts with the
root node, which expands on further branches and constructs a tree-
like structure.
• In order to build a tree, we use the CART algorithm, which stands
for Classification and Regression Tree algorithm.
• A decision tree simply asks a question and, based on the answer
(Yes/No), further splits the tree into subtrees.
• Why use Decision Trees?
• Decision trees usually mimic human thinking ability while making a
decision, so they are easy to understand.
• The logic behind a decision tree can be easily understood because it
shows a tree-like structure.
Decision Tree Terminologies
• Root Node: Root node is from where the decision tree starts. It
represents the entire dataset, which further gets divided into two or
more homogeneous sets.
• Leaf Node: Leaf nodes are the final output nodes, and the tree cannot
be segregated further after reaching a leaf node.
• Decision node: The internal nodes, where the dataset is split based on a
feature.
Working of algorithm
• Step-1: Begin the tree with the root node, say S, which contains the complete
dataset.
• Step-2: Find the best attribute in the dataset using an Attribute Selection Measure
(ASM).
• Step-3: Divide S into subsets that contain the possible values of the best
attribute.
• Step-4: Generate the decision tree node, which contains the best attribute.
• Step-5: Recursively make new decision trees using the subsets of the dataset
created in Step-3. Continue this process until a stage is reached where the nodes
cannot be classified further; call the final nodes leaf nodes.
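The recursive loop in Steps 1–5 can be sketched compactly. This sketch uses weighted Gini impurity as its ASM, and the tiny salary/distance dataset is assumed for illustration (it is not the slide's dataset).

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum(p_j^2)."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_attribute(rows, labels, attrs):
    """Step 2: pick the attribute whose split gives the lowest weighted Gini."""
    def weighted_gini(a):
        total = 0.0
        for v in set(r[a] for r in rows):
            subset = [l for r, l in zip(rows, labels) if r[a] == v]
            total += len(subset) / len(labels) * gini(subset)
        return total
    return min(attrs, key=weighted_gini)

def build_tree(rows, labels, attrs):
    """Steps 3-5: split on the best attribute and recurse until pure."""
    if len(set(labels)) == 1 or not attrs:        # leaf node
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(rows, labels, attrs)
    node = {"attr": a, "branches": {}}
    for v in set(r[a] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[a] == v]
        node["branches"][v] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx],
            [x for x in attrs if x != a])
    return node

# assumed toy data: accept a job offer based on salary and distance
rows = [{"salary": "high", "distance": "near"},
        {"salary": "high", "distance": "far"},
        {"salary": "low",  "distance": "near"},
        {"salary": "low",  "distance": "far"}]
labels = ["yes", "yes", "no", "no"]
tree = build_tree(rows, labels, ["salary", "distance"])
print(tree["attr"])  # salary — it splits the labels perfectly
```

Because "salary" separates the classes completely, it becomes the root and each of its branches is immediately a pure leaf.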
Attribute Selection Measures

• While implementing a decision tree, the main issue that arises is how to
select the best attribute for the root node and for the sub-nodes. To
solve such problems there is a technique called the Attribute Selection
Measure, or ASM. With this measurement, we can easily select the best
attribute for the nodes of the tree. There are two popular techniques
for ASM, which are:
• Information Gain
• Gini Index
1. Information Gain:
• It calculates how much information a feature provides about a class.
• According to the value of information gain, we split the node and build the
decision tree.
• A decision tree algorithm always tries to maximize the value of information gain,
and the node/attribute having the highest information gain is split first. It can be
calculated using the below formula:

Information Gain = Entropy(S) − [(Weighted Avg) × Entropy(each feature)]

• Entropy: Entropy is a metric to measure the impurity in a given attribute. It
specifies the randomness in the data. Entropy can be calculated as:

Entropy(S) = −P(yes) log2 P(yes) − P(no) log2 P(no)
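The entropy and information-gain formulas above translate directly into code. A minimal sketch, with toy labels assumed for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum over classes of P(class) * log2 P(class)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Entropy(S) minus the weighted-average entropy after splitting."""
    n = len(labels)
    remainder = sum(
        (sum(1 for f in feature_values if f == v) / n)
        * entropy([l for l, f in zip(labels, feature_values) if f == v])
        for v in set(feature_values)
    )
    return entropy(labels) - remainder

labels  = ["yes", "yes", "no", "no"]
feature = ["a", "a", "b", "b"]            # splits the classes perfectly
print(entropy(labels))                    # 1.0 — maximally impure 50/50 set
print(information_gain(labels, feature))  # 1.0 — the split removes all impurity
```

A feature that splits the classes perfectly yields a gain equal to the parent's entropy, which is why such a feature is always chosen first.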
Example
Suppose there is a candidate who has a job offer and wants to decide
whether he should accept the offer or not. To solve this problem,
the decision tree starts with the root node (the Salary attribute, selected
by ASM). The root node splits further into the next decision node (distance
from the office) and one leaf node based on the corresponding labels. The
next decision node further splits into one decision node (cab facility)
and one leaf node. Finally, the decision node splits into two leaf
nodes (Accepted offer and Declined offer).
2. Gini Index:
• The Gini index is a measure of impurity or purity used while creating a
decision tree in the CART (Classification and Regression Tree)
algorithm.
• An attribute with a low Gini index should be preferred over one with a
high Gini index.
• It only creates binary splits, and the CART algorithm uses the Gini
index to create binary splits.
• The Gini index can be calculated using the below formula:
• Gini Index = 1 − Σj Pj²
Steps to Calculate Gini for a split
• Calculate Gini for each sub-node, using the formula: the sum of the
squares of the probabilities of success and failure (p² + q²).
• Calculate Gini for the split using the weighted Gini score of each node
of that split.
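The two steps above can be sketched directly; note they use the p² + q² score (node purity) rather than the 1 − Σ Pj² impurity form. The toy split below is assumed for illustration.

```python
from collections import Counter

def gini_score(labels):
    """Step 1, per node: p^2 + q^2 (sum of squared class probabilities)."""
    n = len(labels)
    return sum((c / n) ** 2 for c in Counter(labels).values())

def gini_for_split(groups):
    """Step 2: weighted Gini score across the nodes of the split."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * gini_score(g) for g in groups)

# assumed candidate split: 4 samples go left, 2 go right
left  = ["yes", "yes", "yes", "no"]
right = ["no", "no"]
print(round(gini_for_split([left, right]), 4))  # 0.75
```

The left node scores (3/4)² + (1/4)² = 0.625 and the pure right node scores 1.0; weighting by size gives 4/6 · 0.625 + 2/6 · 1.0 = 0.75.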
Advantages and disadvantages
• Advantages of the Decision Tree
• It is simple to understand, as it follows the same process which a human follows
while making any decision in real life.
• It can be very useful for solving decision-related problems.
• It helps to think about all the possible outcomes for a problem.
• There is less requirement for data cleaning compared to other algorithms.
• Disadvantages of the Decision Tree
• A decision tree may contain lots of layers, which makes it complex.
• It may have an overfitting issue, which can be resolved using the Random Forest
algorithm.
• For more class labels, the computational complexity of the decision tree may
increase.
Dataset
Linear regression
• Linear regression is one of the easiest and most popular Machine
Learning algorithms.
• It is a statistical method that is used for predictive analysis.
• Linear regression makes predictions for continuous/real or numeric
variables such as sales, salary, age, product price, etc.
• The linear regression algorithm shows a linear relationship between a
dependent (y) variable and one or more independent (x) variables, hence
it is called linear regression.
Linear regression
• Since linear regression shows a linear relationship, it finds how the
value of the dependent variable changes according to the value of the
independent variable.
• The linear regression model provides a sloped straight line
representing the relationship between the variables.
Linear regression

The values of the x and y variables are the training dataset used for the
linear regression model representation.
Types of Linear Regression
• Simple Linear Regression:
• If a single independent variable is used to predict the value of a
numerical dependent variable, then such a Linear Regression
algorithm is called Simple Linear Regression.
• Multiple Linear regression:
• If more than one independent variable is used to predict the value of
a numerical dependent variable, then such a Linear Regression
algorithm is called Multiple Linear Regression.
Linear Regression Line
• A straight line showing the relationship between the dependent and
independent variables is called a regression line.
• Positive Linear Relationship:
• If the dependent variable increases on the Y-axis as the independent
variable increases on the X-axis, then such a relationship is termed a
positive linear relationship.
Linear Regression Line
• Negative Linear Relationship:
• If the dependent variable decreases on the Y-axis as the independent
variable increases on the X-axis, then such a relationship is called a
negative linear relationship.
Simple Linear Regression Model
• The Simple Linear Regression model can be represented using the below equation:

y = a0 + a1·x + ε

• Where,
• a0 = the intercept of the regression line (obtained by putting x = 0), also
called the Bias in ML.
• a1 = the slope of the regression line, which tells whether the line is
increasing or decreasing.
• ε = the error term (for a good model it will be negligible).
Formulas
• Slope of line a1:
• a1 (slope) = Σ((xi − mean(x)) · (yi − mean(y))) / Σ((xi − mean(x))²)
• a0 = mean(y) − a1 · mean(x)
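The slope and intercept formulas can be checked on data where the answer is known in advance. A minimal sketch, with a tiny dataset assumed so that the line y = 1 + 2x is recovered exactly:

```python
def fit_simple_lr(x, y):
    """Least-squares slope a1 and intercept a0, per the formulas above."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    a1 = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
    a0 = my - a1 * mx
    return a0, a1

x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]          # exactly y = 1 + 2x, assumed for illustration
a0, a1 = fit_simple_lr(x, y)
print(a0, a1)  # 1.0 2.0
```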
Example
Year    GDP    4-wheeler passenger vehicle sales (in lakhs)
2011    6.2    26.3
2012    6.5    26.6
2013    5.4    25
2014    6.5    26
2015    7.1    27.9
2016    7.9    30.4
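The same formulas can be applied to the table above (values copied from the table; the fitted coefficients below are computed here, not stated on the slides):

```python
gdp   = [6.2, 6.5, 5.4, 6.5, 7.1, 7.9]       # x: GDP
sales = [26.3, 26.6, 25, 26, 27.9, 30.4]     # y: sales in lakhs

mx, my = sum(gdp) / len(gdp), sum(sales) / len(sales)
a1 = (sum((x - mx) * (y - my) for x, y in zip(gdp, sales))
      / sum((x - mx) ** 2 for x in gdp))     # slope
a0 = my - a1 * mx                            # intercept
print(round(a1, 3), round(a0, 3))            # 2.16 12.777

# hypothetical GDP value, assumed only to show how the line is used
predicted_sales = a0 + a1 * 8.0
```

So a one-point rise in GDP corresponds to roughly 2.16 lakh additional vehicle sales under this fit.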
Multivariate/multiple linear regression
• When the response variable is affected by more than one predictor variable,
the Multiple Linear Regression algorithm is used.
• Multiple Linear Regression is one of the important regression algorithms;
it models the linear relationship between a single dependent continuous
variable and more than one independent variable.
• For MLR, the dependent or target variable (Y) must be continuous/real,
but the predictor or independent variables may be of continuous or
categorical form.
• Each feature variable must model a linear relationship with the
dependent variable.
• MLR tries to fit a regression line through a multidimensional space of
data points.
Multivariate/multiple linear regression
• In Multiple Linear Regression, the target variable (Y) is a linear
combination of multiple predictor variables x1, x2, x3, ..., xn. Since it is
an enhancement of Simple Linear Regression, the same form applies to the
multiple linear regression equation, which becomes:
Y = a0 + a1X1 + a2X2 + a3X3 + a4X4 + … + anXn

• Where,
• Y = the output/response variable
• a0, a1, a2, a3, ..., an = the coefficients of the model
• x1, x2, x3, x4, ... = the various independent/feature variables
Multivariate/multiple linear regression
b1 = [(Σx2²)(Σx1y) − (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) − (Σx1x2)²]

b2 = [(Σx1²)(Σx2y) − (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) − (Σx1x2)²]

b0 = mean(Y) − b1 · mean(X1) − b2 · mean(X2)
Multivariate/multiple linear regression
• Σx1² = ΣX1² − (ΣX1)² / n
• Σx2² = ΣX2² − (ΣX2)² / n
• Σx1y = ΣX1Y − (ΣX1 · ΣY) / n
• Σx2y = ΣX2Y − (ΣX2 · ΣY) / n
• Σx1x2 = ΣX1X2 − (ΣX1 · ΣX2) / n
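The deviation sums above feed directly into the b1/b2 formulas. A minimal helper-function sketch, with variable names following the slide's notation and a tiny dataset assumed so the known plane Y = 1 + 2·X1 + 3·X2 is recovered:

```python
def fit_two_predictor_lr(X1, X2, Y):
    """Two-predictor least squares via the deviation-sum formulas above."""
    n = len(Y)
    Sx1x1 = sum(v * v for v in X1) - sum(X1) ** 2 / n              # Σx1²
    Sx2x2 = sum(v * v for v in X2) - sum(X2) ** 2 / n              # Σx2²
    Sx1y  = sum(a * b for a, b in zip(X1, Y)) - sum(X1) * sum(Y) / n
    Sx2y  = sum(a * b for a, b in zip(X2, Y)) - sum(X2) * sum(Y) / n
    Sx1x2 = sum(a * b for a, b in zip(X1, X2)) - sum(X1) * sum(X2) / n
    den = Sx1x1 * Sx2x2 - Sx1x2 ** 2
    b1 = (Sx2x2 * Sx1y - Sx1x2 * Sx2y) / den
    b2 = (Sx1x1 * Sx2y - Sx1x2 * Sx1y) / den
    b0 = sum(Y) / n - b1 * sum(X1) / n - b2 * sum(X2) / n
    return b0, b1, b2

# assumed toy data generated from Y = 1 + 2*X1 + 3*X2
b0, b1, b2 = fit_two_predictor_lr([0, 1, 2, 1], [0, 0, 1, 2], [1, 3, 8, 9])
print(b0, b1, b2)  # 1.0 2.0 3.0
```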
Example
X1         X2    Y
1          2     3
2          3     4
3          1     6
4          5     8
5          4     10
Sum = 15   15    31
Mean = 3   3     6.2
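Working the example table above through the formulas step by step; the coefficient values in the comments are computed here, not stated on the slides:

```python
X1 = [1, 2, 3, 4, 5]
X2 = [2, 3, 1, 5, 4]
Y  = [3, 4, 6, 8, 10]
n  = len(Y)

# deviation sums, per the formulas above
Sx1x1 = sum(v * v for v in X1) - sum(X1) ** 2 / n                   # 10.0
Sx2x2 = sum(v * v for v in X2) - sum(X2) ** 2 / n                   # 10.0
Sx1y  = sum(a * b for a, b in zip(X1, Y)) - sum(X1) * sum(Y) / n    # 18.0
Sx2y  = sum(a * b for a, b in zip(X2, Y)) - sum(X2) * sum(Y) / n    # 11.0
Sx1x2 = sum(a * b for a, b in zip(X1, X2)) - sum(X1) * sum(X2) / n  # 6.0

den = Sx1x1 * Sx2x2 - Sx1x2 ** 2                                    # 64.0
b1 = (Sx2x2 * Sx1y - Sx1x2 * Sx2y) / den                            # 1.78125
b2 = (Sx1x1 * Sx2y - Sx1x2 * Sx1y) / den                            # 0.03125
b0 = sum(Y) / n - b1 * sum(X1) / n - b2 * sum(X2) / n               # ≈ 0.7625
print(b0, b1, b2)
```

So the fitted model is approximately Y = 0.7625 + 1.78125·X1 + 0.03125·X2: almost all of the variation in Y is explained by X1.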
