Classification Algorithms
Linear Regression
Decision Tree Induction
Bayes Classification Methods
Rule-Based Classification
Model Evaluation and Selection
Techniques to Improve Classification Accuracy: Ensemble Methods
Summary
Supervised vs. Unsupervised Learning
Supervised learning (classification)
  Supervision: the training data (observations, measurements, etc.) are accompanied by labels indicating the class of each observation
  New data is classified based on the training set
Unsupervised learning (clustering)
  The class labels of the data are unknown
  Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data
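A minimal sketch of the contrast, assuming Python with scikit-learn (the toy data and model choices are illustrative, not from the slides):

```python
# Supervised vs. unsupervised learning in a few lines (illustrative, scikit-learn).
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X = [[1, 1], [1, 2], [8, 8], [9, 8]]

# Supervised: class labels accompany the training data.
DecisionTreeClassifier().fit(X, ["a", "a", "b", "b"])

# Unsupervised: no labels; the algorithm looks for clusters on its own.
print(KMeans(n_clusters=2, n_init=10).fit(X).labels_)  # e.g. [0 0 1 1], arbitrary ids
```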
Prediction Problems: Classification vs. Numeric Prediction
Classification
  predicts categorical class labels (discrete or nominal)
  constructs a model from the training set and the values (class labels) of a classifying attribute, and uses it to classify new data
Numeric prediction
  models continuous-valued functions, i.e., predicts unknown or missing numeric values
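The same model family can serve either task; a small illustrative sketch (assuming scikit-learn, with made-up toy values):

```python
# Classification (categorical labels) vs. numeric prediction (continuous values).
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[0], [1], [2], [3]]
clf = DecisionTreeClassifier().fit(X, ["no", "no", "yes", "yes"])  # class labels
reg = DecisionTreeRegressor().fit(X, [1.0, 1.5, 3.2, 4.8])         # numeric targets
print(clf.predict([[2]]), reg.predict([[2]]))                      # ['yes'] [3.2]
```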
Classification—A Two-Step Process
Model construction: describing a set of predetermined classes
  Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
  The set of tuples used for model construction is the training set
  The model is represented as classification rules, decision trees, or mathematical formulae
Model usage: classifying future or unknown objects
  Estimate the accuracy of the model
    The known label of each test sample is compared with the class predicted by the model
    The accuracy rate is the percentage of test-set samples that are correctly classified by the model
    The test set must be independent of the training set; otherwise the accuracy estimate is overly optimistic (overfitting)
  If the accuracy is acceptable, use the model to classify new data
Note: if the test set is also used to select models, it is called a validation (test) set
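A minimal sketch of the two-step process, assuming Python with scikit-learn; the dataset and split are illustrative, not from the slides:

```python
# Minimal sketch of the two-step classification process (illustrative).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Step 1: model construction on the training set only.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = DecisionTreeClassifier().fit(X_train, y_train)

# Step 2: model usage -- estimate accuracy on an independent test set,
# then (if acceptable) classify new, unseen data.
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```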
Process (1): Model Construction
The training data is fed to a classification algorithm, which produces the classifier (model).

Training data:
NAME   RANK            YEARS   TENURED
Mike   Assistant Prof  3       no
Mary   Assistant Prof  7       yes
Bill   Professor       2       yes
Jim    Associate Prof  7       yes
Dave   Assistant Prof  6       no
Anne   Associate Prof  3       no

Learned classifier (model), expressed as a rule:
IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
Process (2): Using the Model in Prediction
The classifier is first evaluated on testing data, and then applied to unseen data.

Testing data:
NAME     RANK            YEARS   TENURED
Tom      Assistant Prof  2       no
Merlisa  Associate Prof  7       no
George   Professor       5       yes
Joseph   Assistant Prof  7       yes

Unseen data: (Jeff, Professor, 4). Tenured?
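A small sketch of this step, assuming Python; the rule and tuples come from the slides above, while the helper name `tenured_rule` is purely illustrative:

```python
# Apply the learned rule (IF rank = 'professor' OR years > 6 THEN tenured = 'yes')
# to the testing data and to the unseen tuple. Helper name is illustrative.
def tenured_rule(rank, years):
    return "yes" if rank == "Professor" or years > 6 else "no"

testing = [("Tom", "Assistant Prof", 2, "no"),
           ("Merlisa", "Associate Prof", 7, "no"),
           ("George", "Professor", 5, "yes"),
           ("Joseph", "Assistant Prof", 7, "yes")]

# Estimate accuracy on the test set (Merlisa is misclassified, so 3/4 here).
correct = sum(tenured_rule(r, y) == label for _, r, y, label in testing)
print("test accuracy:", correct / len(testing))

# Classify the unseen tuple.
print("Jeff ->", tenured_rule("Professor", 4))   # 'yes'
```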
Chapter 8. Classification: Basic Concepts
Classification: Basic Concepts
Decision Tree Induction
Bayes Classification Methods
Rule-Based Classification
Model Evaluation and Selection
Techniques to Improve Classification Accuracy: Ensemble Methods
Summary
Decision Tree Induction: An Example
Training data set: buys_computer

age     income  student  credit_rating  buys_computer
<=30    high    no       fair           no
<=30    high    no       excellent      no
31…40   high    no       fair           yes
>40     medium  no       fair           yes
>40     low     yes      fair           yes
>40     low     yes      excellent      no
31…40   low     yes      excellent      yes
<=30    medium  no       fair           no
<=30    low     yes      fair           yes
>40     medium  yes      fair           yes
<=30    medium  yes      excellent      yes
31…40   medium  no       excellent      yes
31…40   high    yes      fair           yes
>40     medium  no       excellent      no

Resulting tree:
age?
  <=30   -> student?
              no  -> no
              yes -> yes
  31..40 -> yes
  >40    -> credit rating?
              excellent -> no
              fair      -> yes
Algorithm for Decision Tree Induction
Basic algorithm (a greedy algorithm); a minimal sketch follows below
  The tree is constructed in a top-down, recursive, divide-and-conquer manner
  At the start, all the training examples are at the root
  Attributes are categorical (if continuous-valued, they are discretized in advance)
  Examples are partitioned recursively based on selected attributes
  Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain)
Conditions for stopping partitioning
  All samples for a given node belong to the same class
  There are no remaining attributes for further partitioning (majority voting is used to label the leaf)
  There are no samples left
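A minimal sketch of this greedy, recursive procedure in plain Python, assuming categorical attributes and using the information-gain measure defined on the next slides (function and variable names are illustrative, not from any particular library):

```python
# Greedy top-down decision tree induction for categorical attributes (sketch).
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    # Expected reduction in entropy from splitting on `attr`.
    total, n, split = entropy(labels), len(rows), 0.0
    for value in set(r[attr] for r in rows):
        sub = [l for r, l in zip(rows, labels) if r[attr] == value]
        split += len(sub) / n * entropy(sub)
    return total - split

def build_tree(rows, labels, attrs):
    # Stop: all samples in one class, or no attributes left (majority vote).
    if len(set(labels)) == 1:
        return labels[0]
    if not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    tree = {}
    for value in set(r[best] for r in rows):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        tree[value] = build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx],
                                 [a for a in attrs if a != best])
    return (best, tree)

# Example call: rows are dicts keyed by attribute name, e.g.
#   build_tree([{"age": "<=30", "student": "no"}, ...], ["no", ...], ["age", "student"])
```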
Brief Review of Entropy
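As a refresher (a standard definition, consistent with the Info(D) formula on the next slide; the original slide's plot is not reproduced here):

```latex
% Entropy of a discrete distribution over m outcomes with probabilities p_i:
H = -\sum_{i=1}^{m} p_i \log_2 p_i
% For two classes, H reaches its maximum of 1 bit at p_1 = p_2 = 0.5
% and drops to 0 when either class has probability 1.
```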
Attribute Selection Measure: Information Gain (ID3/C4.5)
Expected information (entropy) needed to classify a tuple in D:
  Info(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)
Information needed (after using A to split D into v partitions) to classify D:
  Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Info(D_j)
Information gained by branching on attribute A:
  Gain(A) = Info(D) - Info_A(D)
Attribute Selection: Information Gain
Class P: buys_computer = “yes” (9 tuples); Class N: buys_computer = “no” (5 tuples)
(Training data: the buys_computer table shown earlier.)

  Info(D) = I(9,5) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940

  Info_{age}(D) = \frac{5}{14} I(2,3) + \frac{4}{14} I(4,0) + \frac{5}{14} I(3,2) = 0.694

Here \frac{5}{14} I(2,3) means that “age <= 30” covers 5 of the 14 samples, with 2 yes’es and 3 no’s. Hence

  Gain(age) = Info(D) - Info_{age}(D) = 0.246

Similarly,
  Gain(income) = 0.029
  Gain(student) = 0.151
  Gain(credit_rating) = 0.048

Since age has the highest information gain, it is chosen as the splitting attribute at the root; see the verification sketch below.
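A quick numerical check of these gains, assuming Python (the same entropy/information-gain logic sketched earlier; the data is copied from the table above):

```python
# Verify Gain(age), Gain(income), Gain(student), Gain(credit_rating)
# on the buys_computer data (values copied from the table above).
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

rows = [
    ("<=30", "high", "no", "fair", "no"),        ("<=30", "high", "no", "excellent", "no"),
    ("31…40", "high", "no", "fair", "yes"),      (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),        (">40", "low", "yes", "excellent", "no"),
    ("31…40", "low", "yes", "excellent", "yes"), ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),       (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"), ("31…40", "medium", "no", "excellent", "yes"),
    ("31…40", "high", "yes", "fair", "yes"),     (">40", "medium", "no", "excellent", "no"),
]
labels = [r[-1] for r in rows]
attrs = {"age": 0, "income": 1, "student": 2, "credit_rating": 3}

info_D = entropy(labels)                       # 0.940
for name, col in attrs.items():
    info_A = sum(
        len(sub) / len(rows) * entropy([r[-1] for r in sub])
        for v in set(r[col] for r in rows)
        for sub in [[r for r in rows if r[col] == v]]
    )
    print(f"Gain({name}) = {info_D - info_A:.3f}")
# Slide values: 0.246, 0.029, 0.151, 0.048 (third-decimal differences are rounding).
```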
Overfitting and Tree Pruning
Overfitting: an induced tree may overfit the training data
  Too many branches, some of which may reflect anomalies due to noise or outliers
  Poor accuracy on unseen samples
Two approaches to avoid overfitting (a pruning sketch follows below)
  Prepruning: halt tree construction early; do not split a node if doing so would cause the goodness measure to fall below a threshold
    It is difficult to choose an appropriate threshold
  Postpruning: remove branches from a “fully grown” tree, obtaining a sequence of progressively pruned trees
    Use a set of data different from the training data to decide which is the “best pruned tree”
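One way to realize postpruning in practice is scikit-learn's cost-complexity pruning; this is a specific pruning strategy chosen for illustration, not necessarily the one the slides have in mind. A held-out validation set picks the "best pruned tree":

```python
# Postpruning sketch: grow a full tree, generate a sequence of pruned trees
# via cost-complexity pruning, and keep the one that scores best on held-out data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Candidate pruning strengths derived from the fully grown tree.
alphas = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(
    X_train, y_train).ccp_alphas

# Progressively pruned trees; select by validation accuracy.
best = max(
    (DecisionTreeClassifier(random_state=0, ccp_alpha=a).fit(X_train, y_train)
     for a in alphas),
    key=lambda t: t.score(X_val, y_val))
print("validation accuracy:", best.score(X_val, y_val), "leaves:", best.get_n_leaves())
```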
Chapter 8. Classification: Basic Concepts
Classification: Basic Concepts
Decision Tree Induction
Bayes Classification Methods
Rule-Based Classification
Model Evaluation and Selection
Techniques to Improve Classification Accuracy: Ensemble Methods
Summary
Bayesian Classification: Why?
A statistical classifier: performs probabilistic prediction, i.e., predicts class membership probabilities
Foundation: based on Bayes’ theorem
Performance: a simple Bayesian classifier, the naïve Bayesian classifier, has performance comparable to decision tree and selected neural network classifiers
Incremental: each training example can incrementally increase/decrease the probability that a hypothesis is correct; prior knowledge can be combined with observed data
Standard: even when Bayesian methods are computationally intractable, they provide a standard of optimal decision making against which other methods can be measured
Bayes’ Theorem: Basics
Total probability theorem:
  P(B) = \sum_{i=1}^{M} P(B \mid A_i) P(A_i)
Bayes’ theorem:
  P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)} = P(X \mid H) \times P(H) / P(X)
Let X be a data sample (“evidence”); its class label is unknown
Let H be the hypothesis that X belongs to class C
Classification is to determine P(H|X), the posterior probability: the probability that the hypothesis holds given the observed data sample X
P(H) (prior probability): the initial probability
  E.g., X will buy a computer, regardless of age, income, …
P(X) (evidence): the probability that the sample data is observed
P(X|H) (likelihood): the probability of observing the sample X given that the hypothesis holds
  E.g., given that X will buy a computer, the probability that X is 31..40 with medium income
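A tiny numeric illustration of the theorem; the probabilities below are made up for illustration and are not from the slides:

```python
# Bayes' theorem with illustrative (made-up) numbers:
# H = "customer buys a computer", X = "customer is 31..40 with medium income".
p_h = 0.6          # prior P(H)
p_x_given_h = 0.3  # likelihood P(X|H)
p_x = 0.2          # evidence P(X)

p_h_given_x = p_x_given_h * p_h / p_x   # posterior P(H|X)
print(p_h_given_x)                       # 0.9
```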
Prediction Based on Bayes’ Theorem
Given training data X, the posterior probability of a hypothesis H, P(H|X), follows Bayes’ theorem:
  P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)} = P(X \mid H) \times P(H) / P(X)
Informally, this can be viewed as
  posterior = likelihood × prior / evidence
Predict that X belongs to Ci iff the probability P(Ci|X) is the highest among all the P(Ck|X) for the k classes
Practical difficulty: this requires initial knowledge of many probabilities, which involves significant computational cost
Classification Is to Derive the Maximum Posteriori
Let D be a training set of tuples and their associated class labels, where each tuple is represented by an n-dimensional attribute vector X = (x1, x2, …, xn)
Suppose there are m classes C1, C2, …, Cm
Classification is to derive the maximum posteriori probability, i.e., the maximal P(Ci|X)
This can be derived from Bayes’ theorem:
  P(C_i \mid X) = \frac{P(X \mid C_i)\, P(C_i)}{P(X)}
Since P(X) is constant for all classes, only
  P(X \mid C_i)\, P(C_i)
needs to be maximized
Naïve Bayes Classifier: Training Dataset
Classes:
  C1: buys_computer = ‘yes’
  C2: buys_computer = ‘no’
Data to be classified:
  X = (age <= 30, income = medium, student = yes, credit_rating = fair)

age     income  student  credit_rating  buys_computer
<=30    high    no       fair           no
<=30    high    no       excellent      no
31…40   high    no       fair           yes
>40     medium  no       fair           yes
>40     low     yes      fair           yes
>40     low     yes      excellent      no
31…40   low     yes      excellent      yes
<=30    medium  no       fair           no
<=30    low     yes      fair           yes
>40     medium  yes      fair           yes
<=30    medium  yes      excellent      yes
31…40   medium  no       excellent      yes
31…40   high    yes      fair           yes
>40     medium  no       excellent      no
Naïve Bayes Classifier: An Example
P(Ci):
  P(buys_computer = “yes”) = 9/14 = 0.643
  P(buys_computer = “no”) = 5/14 = 0.357
Compute P(X|Ci) for each class:
  P(age = “<=30” | buys_computer = “yes”) = 2/9 = 0.222
  P(age = “<=30” | buys_computer = “no”) = 3/5 = 0.6
  P(income = “medium” | buys_computer = “yes”) = 4/9 = 0.444
  P(income = “medium” | buys_computer = “no”) = 2/5 = 0.4
  P(student = “yes” | buys_computer = “yes”) = 6/9 = 0.667
  P(student = “yes” | buys_computer = “no”) = 1/5 = 0.2
  P(credit_rating = “fair” | buys_computer = “yes”) = 6/9 = 0.667
  P(credit_rating = “fair” | buys_computer = “no”) = 2/5 = 0.4
X = (age <= 30, income = medium, student = yes, credit_rating = fair)
P(X|Ci):
  P(X | buys_computer = “yes”) = 0.222 × 0.444 × 0.667 × 0.667 = 0.044
  P(X | buys_computer = “no”) = 0.6 × 0.4 × 0.2 × 0.4 = 0.019
P(X|Ci) × P(Ci):
  P(X | buys_computer = “yes”) × P(buys_computer = “yes”) = 0.028
  P(X | buys_computer = “no”) × P(buys_computer = “no”) = 0.007
Therefore, X belongs to class “buys_computer = yes” (see the verification sketch below)
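A short sketch that recomputes these numbers directly from the table, assuming Python (so the counts above can be checked):

```python
# Recompute the naive Bayes scores P(X|Ci) * P(Ci) for
# X = (age<=30, income=medium, student=yes, credit_rating=fair),
# using the buys_computer table above.
rows = [
    ("<=30", "high", "no", "fair", "no"),        ("<=30", "high", "no", "excellent", "no"),
    ("31…40", "high", "no", "fair", "yes"),      (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"),        (">40", "low", "yes", "excellent", "no"),
    ("31…40", "low", "yes", "excellent", "yes"), ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"),       (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"), ("31…40", "medium", "no", "excellent", "yes"),
    ("31…40", "high", "yes", "fair", "yes"),     (">40", "medium", "no", "excellent", "no"),
]
x = ("<=30", "medium", "yes", "fair")            # the tuple to classify

for c in ("yes", "no"):
    in_class = [r for r in rows if r[-1] == c]
    prior = len(in_class) / len(rows)            # P(Ci)
    likelihood = 1.0
    for j, value in enumerate(x):                # naive (conditional independence) assumption
        likelihood *= sum(r[j] == value for r in in_class) / len(in_class)
    print(c, round(prior * likelihood, 3))       # yes: 0.028, no: 0.007
```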
Comparison of Decision Tree and Bayesian Classification