CLASSIFICATION
Instructor
Dr. Prashant Srivastava
INTRODUCTION
Classification is a type of supervised learning
where a target feature, which is of categorical
type, is predicted for the test data on the basis
of information imparted by the training data.
The target categorical feature is known as the
class.
CLASSIFICATION
LEARNING STEPS
COMMON CLASSIFICATION
ALGORITHMS
k-nearest neighbour
Logistic regression
Decision tree
Random forest
Support Vector Machine
Naïve Bayes classifier
k-NEAREST NEIGHBOUR
A simple but extremely powerful classification
algorithm.
k-NEAREST NEIGHBOUR
The unknown, unlabelled data element that comes
for a prediction problem is judged on the basis
of the training dataset elements that are
similar to it.
The class label of the unknown element is
assigned on the basis of the class labels of these
similar training dataset elements.
k-NEAREST NEIGHBOUR
Consider a Student dataset.
k-NEAREST NEIGHBOUR
1. Leader- students having good communication
skills as well as a good level of aptitude.
2. Speaker- students having good communication
skills but not so good a level of aptitude.
3. Intel- students having not so good
communication skills but a good level of
aptitude.
k-NEAREST NEIGHBOUR
Performance of the classification model is
measured by the number of correct
classifications made by the model when applied
to an unknown dataset.
If the class value predicted for most of the test
data elements matches the actual class value
they have, then we say that the classification
model has good accuracy.
k-NEAREST NEIGHBOUR
Two challenges-
1. What is the basis of similarity?
2. How many similar elements should be
considered for deciding the class label of each
test data element?
k-NEAREST NEIGHBOUR
Measures of similarity.
The most common approach adopted by kNN to
measure similarity between two data elements
is Euclidean distance.
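As a small illustration, the Euclidean distance between two data elements can be computed as in the Python sketch below (a minimal example; the two hypothetical students and their feature values are made up purely for illustration):

import math

def euclidean_distance(a, b):
    # Square root of the sum of squared differences across all features
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical students described by (communication score, aptitude score)
student_1 = (7.0, 8.5)
student_2 = (6.0, 4.0)
print(euclidean_distance(student_1, student_2))   # about 4.61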
k-NEAREST NEIGHBOUR
To find the closest or nearest neighbours of a
test data point, the Euclidean distance of the
different training data points from the test data
point needs to be calculated.
Then, the class values of the closest
neighbours help in assigning the class
value of the test data element.
k-NEAREST NEIGHBOUR
How many similar elements should be
considered?
The answer lies in the value of ‘k’ which is a
user-defined parameter given as input to the
algorithm.
In the kNN algorithm, the value of ‘k’ indicates the
number of neighbours that need to be
considered.
k-NEAREST NEIGHBOUR
For example, if the value of k is 3, only the three
nearest neighbours, i.e. the three training data
elements closest to the test data element, are
considered.
Out of these three data elements, the class which
is predominant is assigned as the class label
of the test data.
k-NEAREST NEIGHBOUR
Choosing value of ‘k’
If the value of k is very large, the class label
of the majority class of the training data set will
be assigned to the test data regardless of the
class labels of the neighbours nearest to the
test data.
If the value of k is very small, the class
value of a noisy data point or outlier in the training
dataset which happens to be the nearest neighbour to
the test data will be assigned to the test data.
k-NEAREST NEIGHBOUR
Then, what should be ‘k’?
The best value of ‘k’ is somewhere between
these two extremes.
k-NEAREST NEIGHBOUR
1. One common practice- set k equal to the
square root of the number of training records.
2. Test several ‘k’ values on a variety of test
datasets and choose the one that delivers the
best performance.
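A minimal sketch of the square-root rule of thumb (assuming a training set of n records; preferring an odd k is a common extra heuristic to avoid ties in a two-class problem):

import math

n_training_records = 150
k = int(round(math.sqrt(n_training_records)))   # about 12 for 150 records
if k % 2 == 0:
    k += 1                                      # prefer an odd k to avoid ties
print(k)                                        # 13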
k-NEAREST NEIGHBOUR
Algorithm-
1. Predict a class value for new data: calculate
distance(X, Xi) for i = 1, 2, 3, …, n,
where X = new data point, Xi = training data, and
distance is computed as per your chosen distance metric.
2. Sort these distances in increasing order along with
the corresponding training data.
3. From this sorted list, select the top ‘K’ rows.
4. Find the most frequent class among these chosen
‘K’ rows. This will be your predicted class.
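The four steps above can be put together in a short Python sketch (a minimal illustration, assuming numeric features; the function name knn_predict and the tiny training set of student scores are made up for this example):

import math
from collections import Counter

def knn_predict(X_new, training_data, k):
    # Step 1: distance of X_new from every training element Xi
    distances = [(math.dist(X_new, Xi), label) for Xi, label in training_data]
    # Step 2: sort distances in increasing order (labels travel with them)
    distances.sort(key=lambda pair: pair[0])
    # Step 3: keep the top k rows
    top_k = distances[:k]
    # Step 4: most frequent class among the k neighbours is the prediction
    return Counter(label for _, label in top_k).most_common(1)[0][0]

# Hypothetical (communication, aptitude) scores with class labels
training_data = [((8, 9), "Leader"), ((7, 3), "Speaker"),
                 ((2, 8), "Intel"),  ((9, 8), "Leader"), ((6, 2), "Speaker")]
print(knn_predict((7, 8), training_data, k=3))   # expected: "Leader"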
k-NEAREST NEIGHBOUR
Strengths-
1. Simple and easy to understand.
2. Effective in certain situations.
3. Almost no time required for training phase.
k-NEAREST NEIGHBOUR
Weaknesses-
1. Does not learn anything in real sense.
2. Classification process is very slow.
3. Large amount of computational space is
required.
k-NEAREST NEIGHBOUR
k-nearest neighbour is a ‘lazy learner’.
What is lazy learning?
k-NEAREST NEIGHBOUR
Lazy learning is a type of machine learning that
doesn't process training data until it needs to
make a prediction.
Instead of building models during training, lazy
learning algorithms wait until they encounter a
new query.
k-NEAREST NEIGHBOUR
This method stores and compares training
examples when making predictions.
It's also called instance-based or memory-based
learning.
k-NEAREST NEIGHBOUR
Lazy learning algorithms work by memorizing
the training data rather than constructing a
general model.
When a new query is received, lazy learning
retrieves similar instances from the training set
and uses them to generate a prediction.
The similarity between instances is usually
calculated using distance metrics.
LOGISTIC REGRESSION
A type of regression analysis used for predicting
the outcome of a categorical variable.
In logistic regression, the dependent variable (Y) is
binary (0, 1) and the independent variables (X) are
continuous in nature.
LOGISTIC REGRESSION
The goal of logistic regression is to predict the
likelihood that Y = 1 given a certain value of X.
If X and Y have a strong positive linear
relationship, the probability that a person will
have a score of Y = 1 will increase as the values
of X increase.
So, we are predicting probabilities rather than
the scores of the dependent variable.
LOGISTIC REGRESSION
It’s essential to emphasize that logistic
regression is not just a classification algorithm;
it’s a method for estimating probabilities.
Logistic regression is a powerful classification
technique which estimates the likelihood of an
input belonging to a particular class.
This estimation is inherently a probability
prediction, which must be converted into binary
values (0 or 1) to make class predictions.
HOW LOGISTIC
REGRESSION WORKS?
The Logistic Regression algorithm works by
implementing a linear equation with
independent or explanatory variables to predict
a response value.
For example, consider the number of hours
studied and the probability of passing the exam.
Here, the number of hours studied is the
explanatory variable and it is denoted by x1.
The probability of passing the exam is the response
or target variable and it is denoted by z.
HOW LOGISTIC
REGRESSION WORKS?
If we have one explanatory variable (x) and one
response variable (z), then the linear equation
would be given mathematically with the
following equation-
z = a + bx
SIGMOID FUNCTION
This predicted response value, denoted by z, is
then converted into a probability value that lies
between 0 and 1.
We use the sigmoid function in order to map
predicted values to probability values.
The sigmoid function maps any real value
into a probability value between 0 and 1.
SIGMOID FUNCTION
The sigmoid function is used to map predictions to
probabilities. The sigmoid function has an S-shaped
curve.
It is also called the sigmoid curve.
The sigmoid function is a special case of the
logistic function.
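Putting the two pieces together, a minimal Python sketch of mapping the linear response z = a + bx to a probability with the sigmoid function sigmoid(z) = 1 / (1 + e^(-z)); the coefficient values a and b below are made up purely for illustration:

import math

def sigmoid(z):
    # Maps any real value z into a probability between 0 and 1
    return 1.0 / (1.0 + math.exp(-z))

a, b = -4.0, 1.5          # hypothetical intercept and slope
hours_studied = 4.0       # explanatory variable x1
z = a + b * hours_studied # linear response value
p = sigmoid(z)            # probability of passing the exam
print(round(p, 3))        # 0.881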
DECISION BOUNDARY
The sigmoid function returns a probability value
between 0 and 1.
This probability value is then mapped to a
discrete class which is either “0” or “1”.
DECISION BOUNDARY
In order to map this probability value to a
discrete class (pass/fail, yes/no, true/false), we
select a threshold value.
This threshold value is called the Decision
boundary.
Above this threshold value, we will map the
probability values into class 1 and below which
we will map values into class 0.
DECISION BOUNDARY
Mathematically, it can be expressed as follows:-
p ≥ 0.5 => class = 1
p < 0.5 => class = 0
DECISION BOUNDARY
Generally, the decision boundary is set to 0.5.
So, if the probability value is 0.8 (> 0.5), we will
map this observation to class 1.
Similarly, if the probability value is 0.2 (< 0.5),
we will map this observation to class 0.
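A minimal sketch of applying the 0.5 decision boundary to predicted probabilities (the probability values are made up for illustration):

def to_class(p, threshold=0.5):
    # Map a probability to class 1 if it is at or above the threshold, else class 0
    return 1 if p >= threshold else 0

print(to_class(0.8))   # 1 -> class 1 (e.g. "pass")
print(to_class(0.2))   # 0 -> class 0 (e.g. "fail")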
SUPPORT VECTOR
MACHINE
Support Vector Machine (SVM) is a model which
can perform linear classification as well as
regression.
SVM is based on the concept of a surface,
called a hyperplane, which draws a boundary
between data instances plotted in the
multidimensional feature space.
SUPPORT VECTOR
MACHINE
The output prediction of an SVM is one of the
two conceivable classes which are already
defined in the training data.
SVM algorithm builds an N-dimensional
hyperplane model that assigns future instances
into one of the two possible output classes.
SUPPORT VECTOR
MACHINE
The goal of the SVM algorithm is to create the
best line or decision boundary that can
segregate n-dimensional space into classes so
that we can easily put the new data point in the
correct category in the future.
This best decision boundary is called a
hyperplane.
SUPPORT VECTOR
MACHINE
There may be many possible hyperplanes.
One of the challenges with the SVM model is to
find the optimal hyperplane.
SUPPORT VECTOR
MACHINE
There can be multiple hyperplanes to segregate
the classes in n-dimensional space, but we
need to find out the best hyperplane that helps
to classify the data points.
We always create a hyperplane that has the
maximum margin, which means the maximum
distance between the hyperplane and the nearest
data points of the two classes.
SUPPORT VECTOR
MACHINE
The data points or vectors that are closest
to the hyperplane and which affect the position
of the hyperplane are termed Support
Vectors.
Since these vectors support the hyperplane,
they are called support vectors.
SUPPORT VECTOR
MACHINE
Hyperplane and Margin-
Mathematically, in a two-dimensional space, a
hyperplane can be defined by the equation
c0 + c1x1 + c2x2 = 0
SUPPORT VECTOR
MACHINE
Extending this concept to an N-dimensional
space, the hyperplane can be defined by the
equation
c0 + c1x1 + c2x2 + … + cNxN = 0
In short, it can be represented as follows-
c . X + c0 = 0, where c = (c1, c2, …, cN) is the
weight vector and X = (x1, x2, …, xN) is the
feature vector.
SUPPORT VECTOR
MACHINE
The farther the data points lie from the
hyperplane, the more confident we are about
correct categorization.
The distance between the hyperplane and the
closest data points is known as the margin.
SUPPORT VECTOR
MACHINE
Maximum Margin Hyperplane-
Refers to identifying the hyperplane which has
the largest separation from the data instances
of the two classes.
Though more than one hyperplane can do the
correct classification, why do we need to search
for the hyperplane causing the largest
separation?
SUPPORT VECTOR
MACHINE
The answer is that doing so helps us achieve
better generalization and hence fewer errors
in the classification of unknown data.
SUPPORT VECTOR
MACHINE
Identifying MMH for linearly separable data
An outer boundary needs to be drawn around the
data instances belonging to each class.
These outer boundaries are known as convex
hulls.
The MMH can be drawn as the perpendicular
bisector of the shortest line connecting the two
convex hulls.
SUPPORT VECTOR
MACHINE
Find a set of values for the vector c such that the
two hyperplanes, represented by the equations
below, can be specified-
c . X + c0 = +1
c . X + c0 = -1
SUPPORT VECTOR
MACHINE
All the instances that belong to one class fall
above one hyperplane and all the data
instances belonging to the other class fall below
the other hyperplane.
According to vector geometry, the distance
between these planes should be
2 / ||c||
SUPPORT VECTOR
MACHINE
In order to maximize the distance between the
hyperplanes, the norm of the vector c, i.e. ||c||,
should be minimised.
The task of SVM is to solve the optimization
problem-
minimise (1/2) ||c||^2 subject to
yi (c . Xi + c0) ≥ 1 for every training instance (Xi, yi)
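As an illustration of this formulation in practice, the sketch below fits a linear SVM with scikit-learn (assuming scikit-learn is available; the tiny two-class dataset is made up, and SVC with a linear kernel solves a soft-margin version of the margin-maximisation problem described above):

from sklearn.svm import SVC

# Hypothetical two-dimensional, linearly separable training data
X = [[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]]
y = [0, 0, 0, 1, 1, 1]

model = SVC(kernel="linear", C=1.0)   # linear kernel -> maximum margin hyperplane
model.fit(X, y)

print(model.support_vectors_)          # the support vectors found
print(model.predict([[3, 2], [7, 6]])) # expected: [0 1]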
SUPPORT VECTOR
MACHINE
Strengths-
1. Can be used for both classification and
regression
2. Not much impacted by data with noise or
outlier
3. Prediction using this model is promising
SUPPORT VECTOR
MACHINE
Weaknesses-
1. Applicable only for binary classification
2. Slow for large datasets
3. Quite memory-intensive
DECISION TREE
One of the most widely adopted algorithms for
classification.
It builds the model in the form of a tree structure.
It is exceptionally productive in practice.
DECISION TREE
Used for multidimensional analysis with
multiple classes.
The goal of decision tree learning is to create a
model, based on the past data (called the past
vector), that predicts the value of the output variable
on the basis of the input variables in the feature vector.
DECISION TREE
Each node of a decision tree corresponds to one
of the features in the feature vector.
For every node there are edges to children,
wherein there is an edge for each of the
possible values (or range of values) of the
feature associated with the node.
DECISION TREE
The tree terminates at different leaf nodes
where each leaf node represents a possible
value for the output variable.
The output variable is determined by following
a path that starts at the root and is guided by
the values of the input variables.
BUILDING A DECISION
TREE
It starts from the root node, which is nothing
but the entire dataset.
It selects the feature which predicts the target
class in the strongest way.
BUILDING A DECISION
TREE
The decision tree splits the dataset into multiple
partitions, with data in each partition having a
distinct value for the feature based on which
the partitioning has happened.
This is the first set of branches.
BUILDING A DECISION
TREE
Likewise, the algorithm continues splitting the
nodes on the basis of the feature which gives
the best partition.
This continues till a stopping criterion is reached.
BUILDING A DECISION
TREE
The usual stopping criteria are-
1. All or most of the examples at a particular
node have the same class
2. All features have been used up in the
partitioning
3. The tree has grown to a pre-defined
threshold limit
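As an illustration, the sketch below builds a small decision tree with scikit-learn; the pre-defined depth limit plays the role of stopping criterion 3 (the tiny student dataset and the parameter values are made up purely for illustration):

from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical (communication, aptitude) scores and class labels
X = [[8, 9], [9, 8], [7, 3], [6, 2], [2, 8], [3, 9]]
y = ["Leader", "Leader", "Speaker", "Speaker", "Intel", "Intel"]

tree = DecisionTreeClassifier(criterion="entropy",  # split on information gain
                              max_depth=3)          # pre-defined threshold limit
tree.fit(X, y)

print(export_text(tree, feature_names=["communication", "aptitude"]))
print(tree.predict([[8, 8]]))   # expected: ['Leader']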
BUILDING A DECISION
TREE
There are many implementations of decision
tree.
The biggest challenge of a decision tree
algorithm is to find out which feature to split
upon.
BUILDING A DECISION
TREE
Data should be split in such a way that the
partitions created by the split contain
examples belonging to a single class.
If that happens, the partitions are considered
pure.
BUILDING A DECISION
TREE
Entropy is a measure of the impurity of a partition
of the dataset with respect to the class.
The information gain is calculated on the
basis of the decrease in entropy after the dataset (S)
is split according to a particular attribute (A).
BUILDING A DECISION
TREE
Constructing a tree is all about finding an
attribute that returns the highest
information gain.
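A minimal sketch of the two quantities (assuming the usual definitions: entropy of a set of class labels, H(S) = -sum of pi * log2(pi), and information gain as the parent entropy minus the weighted entropy of the partitions produced by splitting on an attribute; the labels below are made up for illustration):

import math
from collections import Counter

def entropy(labels):
    # H(S) = -sum(p_i * log2(p_i)) over the class proportions p_i
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(parent_labels, partitions):
    # Gain(S, A) = H(S) - sum(|S_v| / |S| * H(S_v)) over the partitions S_v
    total = len(parent_labels)
    weighted = sum(len(part) / total * entropy(part) for part in partitions)
    return entropy(parent_labels) - weighted

# Hypothetical split of 10 examples into two partitions by some attribute A
parent = ["yes"] * 5 + ["no"] * 5
partitions = [["yes"] * 4 + ["no"] * 1, ["yes"] * 1 + ["no"] * 4]
print(round(information_gain(parent, partitions), 3))   # 0.278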
REFERENCES
1. Ethem Alpaydin. Machine learning. MIT press,
2021.
2. S Dutt, S Chandramouli, A K Das. Machine
Learning. Pearson, 2022.