Murang’a University of Technology
Innovation for Prosperity
Lecture 2
Supervised Learning - Classification
Elements of a Learning Task
Three key elements define a learning task in machine learning. Together, they frame the task's scope, input, and evaluation:
1. Task (T)
• This defines what the machine learning model is expected to accomplish.
• Examples include classification, regression, clustering, or reinforcement learning
tasks.
2. Experience (E)
• This refers to the data or interaction the model uses to learn.
• For supervised learning, the experience involves labeled datasets with input-
output pairs. In unsupervised learning, the experience comes from unlabeled data
patterns. Reinforcement learning draws experience from agent-environment
interactions and feedback (rewards).
3. Performance Measure (P)
• This quantifies how well the model is achieving the task.
• Common metrics include accuracy, precision, recall, and F1-score for classification, mean squared error (MSE) for regression, or cumulative reward for reinforcement learning.
• The performance measure evaluates the model's output on unseen test data to ensure it generalizes well.
Introduction to Supervised Learning
• Supervised learning is a type of machine learning where the model
learns from labeled data.
• The goal is to map the input to the output and predict the labels of
unseen data accurately.
• Supervised learning presents two types of problems: classification and regression.
How It Works:
1. Input Data: Contains both features (independent variables) and labels
(dependent variables).
2. Learning Phase: The model identifies patterns in the data that map
inputs to outputs.
3. Prediction Phase: For new data, the model predicts the label using the
learned patterns.
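
A minimal sketch of this three-step workflow in Python with scikit-learn (the Iris dataset and the logistic regression classifier are illustrative assumptions, not prescribed by the lecture):

# Supervised learning workflow: learn from labeled data, then predict unseen labels.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 1. Input data: features X (independent variables) and labels y (dependent variable)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 2. Learning phase: fit patterns that map inputs to outputs
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 3. Prediction phase: predict labels for data the model has not seen
y_pred = model.predict(X_test)
print("Test accuracy:", model.score(X_test, y_test))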
Introduction to Classification
• Classification is a supervised learning task where the model assigns
a category or label to an input based on its features.
• It deals with discrete outputs, such as "yes" or "no," "cat" or "dog,"
or multiple classes like "setosa," "versicolor," and "virginica" in the
Iris dataset.
Key Terms in Classification:
• Classes: Categories or labels (e.g., spam/not spam).
• Features: Attributes used to classify the input (e.g., word
frequencies in an email).
• Decision Boundary: The boundary that separates different classes
in the feature space.
Types of Classification Tasks
1. Binary Classification:
• Two possible classes (e.g., spam vs. not spam).
2. Multiclass Classification:
• More than two classes (e.g., classifying images as "cat," "dog," or
"bird").
3. Multilabel Classification:
• Each instance can belong to multiple classes simultaneously (e.g.,
tagging a movie with genres like "action," "comedy," and "thriller").
Types of Classification Algorithms
1. k-Nearest Neighbors (k-NN)
• The k-NN algorithm is one of the simplest yet most effective classification algorithms. It classifies a data point according to the majority class among its nearest neighbors.
How It Works:
• Compute the distance (e.g., Euclidean, Manhattan) between the input data point
and all other points in the training set.
• Identify the k nearest neighbors to the data point.
• Assign the class that is most common among these k neighbors.
Strengths:
• Simple to implement and understand.
• Works well with small datasets and non-linear decision boundaries.
Weaknesses:
• Computationally expensive for large datasets.
• Sensitive to irrelevant or redundant features.
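
A short k-NN sketch with scikit-learn (k=3 and the Euclidean distance are illustrative assumptions):

# k-NN: classify a point by majority vote among its k nearest training points.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# n_neighbors is k; metric selects the distance function (Euclidean here)
knn = KNeighborsClassifier(n_neighbors=3, metric="euclidean")
knn.fit(X_train, y_train)  # "training" just stores the data; distances are computed at prediction time
print("Test accuracy:", knn.score(X_test, y_test))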
2. Naïve Bayes
• Naïve Bayes is based on Bayes' Theorem, which calculates the probability of a class
given a set of features.
• It assumes that the features are conditionally independent of each other, which
may not always be true in practice but works surprisingly well for many problems.
How It Works:
• For each class, compute the likelihood of the observed features given that class by multiplying the per-feature probabilities.
• Apply Bayes' Theorem to combine this likelihood with the class prior, yielding the posterior probability of the class.
• Assign the class with the highest posterior probability.
Strengths:
• Extremely fast and efficient for high-dimensional data.
• Performs well on text classification problems.
Weaknesses:
• Assumes independence among features, which may not always hold true.
• Performs poorly if features are highly correlated.
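
A minimal Naïve Bayes sketch with scikit-learn (GaussianNB, which models each feature with a per-class Gaussian, is one of several variants; MultinomialNB is the usual choice for text):

# Naive Bayes: assign the class with the highest posterior probability.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

nb = GaussianNB()  # assumes features are conditionally independent given the class
nb.fit(X_train, y_train)  # estimates class priors and per-class feature likelihoods
print("Posterior probabilities for one sample:", nb.predict_proba(X_test[:1]))
print("Test accuracy:", nb.score(X_test, y_test))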
3. Support Vector Machines (SVM)
• SVM is a robust and versatile classification algorithm that works by finding the
hyperplane that best separates the data points of different classes.
• Depending on the type of data, there are two types of Support Vector
Machines:
Linear SVM (Simple SVM)
• Used for data that is linearly separable, i.e., data that can be divided into two classes by a single straight line (or, in higher dimensions, a flat hyperplane).
Nonlinear SVM (Kernel SVM)
• Used for data that is not linearly separable, i.e., data that cannot be divided by a straight line. It handles such data by mapping it into a higher-dimensional feature space where a separating hyperplane can be found.
Support Vector Machines (SVM)
How SVM Works
• Separate Classes: SVM finds the best hyperplane that divides
data into distinct classes.
• Maximize Margin: It ensures the margin (distance) between the
hyperplane and the nearest data points (support vectors) is as
large as possible.
• Kernel Trick: For non-linear data, SVM transforms the data into a
higher-dimensional space using kernel functions to make it
separable.
• Support Vectors: The data points closest to the hyperplane are
called support vectors, which define the decision boundary.
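
A sketch contrasting a linear and a kernel SVM with scikit-learn (the RBF kernel and default hyperparameters are illustrative assumptions):

# SVM: find the maximum-margin hyperplane; use a kernel for non-linear data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)  # linear (simple) SVM
kernel_svm = SVC(kernel="rbf").fit(X_train, y_train)     # kernel SVM via the kernel trick

print("Linear SVM accuracy:", linear_svm.score(X_test, y_test))
print("Kernel SVM accuracy:", kernel_svm.score(X_test, y_test))
print("Support vectors per class:", linear_svm.n_support_)  # the points that define the boundary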
4. Decision Trees
• Decision trees use a tree-like structure where internal nodes represent feature-
based decisions, and leaf nodes represent class labels.
How It Works:
• At each node, the algorithm selects the feature that best splits the data into
pure subsets (e.g., using metrics like Gini impurity or information gain).
• This process continues recursively until the subsets are pure or a stopping criterion (e.g., a maximum depth) is reached.
Strengths:
• Easy to interpret and visualize.
• Handles both numerical and categorical data.
Weaknesses:
• Prone to overfitting, especially for deep trees.
• Sensitive to small changes in data.
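
A decision tree sketch with scikit-learn (Gini impurity and max_depth=3, a cap that limits overfitting, are illustrative assumptions):

# Decision tree: recursive feature-based splits chosen by an impurity criterion.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
tree.fit(X_train, y_train)
print("Test accuracy:", tree.score(X_test, y_test))
print(export_text(tree))  # human-readable view of the learned splits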
5. Random Forests
• Random Forests address the limitations of decision trees by creating an ensemble
of trees and averaging their predictions.
How It Works:
• Generates multiple decision trees using bootstrap samples of the training
data.
• At each split, a random subset of features is considered to ensure diversity
among the trees.
• The final prediction is made by majority vote (for classification) or by averaging (for regression).
Strengths:
• Reduces overfitting compared to a single decision tree.
• Robust to noisy data and outliers.
Weaknesses:
• Can be computationally expensive for large datasets.
• Less interpretable than a single decision tree.
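
A random forest sketch with scikit-learn (100 trees and square-root feature sampling are illustrative assumptions):

# Random forest: an ensemble of trees grown on bootstrap samples with random feature subsets.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# n_estimators: number of trees; max_features: random feature subset tried at each split
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=42)
forest.fit(X_train, y_train)  # final prediction is a majority vote over the trees
print("Test accuracy:", forest.score(X_test, y_test))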
Evaluating Classification Performance
• Evaluating the performance of a classification model is crucial
for understanding how well it predicts the target variable.
• Evaluation metrics quantify different aspects of a model's performance, giving us the information we need to choose, improve, and deploy models effectively.
• The following are some common metrics and techniques
used:
➢ Confusion Matrix, Accuracy, Recall, Precision and F1 Score
Confusion Matrix
• A confusion matrix is a table that summarizes the classification results
and indicates the number of true positive, true negative, false positive,
and false negative results.
• It provides a clear summary of predictions versus actual class labels, which
offers insights into the model’s accuracy and misclassifications.
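
A sketch computing a binary confusion matrix with scikit-learn (the breast cancer dataset and the scaled logistic regression model are illustrative assumptions):

# Binary confusion matrix layout (scikit-learn convention: rows = actual, columns = predicted):
#                   Predicted negative   Predicted positive
# Actual negative          TN                   FP
# Actual positive          FN                   TP
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print(confusion_matrix(y_test, model.predict(X_test)))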
Accuracy Metric
• The accuracy score represents the percentage of correct predictions
in the overall test data.
• A high accuracy score indicates that the model is making a large
proportion of correct predictions, while a low accuracy score
indicates that the model is making too many incorrect predictions.
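• In confusion-matrix terms: Accuracy = (TP + TN) / (TP + TN + FP + FN)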
Recall Metric
• Recall measures the proportion of true positives among all actual positive instances, i.e., how many of the real positive cases the model finds.
• It is especially important when missing a positive case is costly (e.g., disease screening).
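• In confusion-matrix terms: Recall = TP / (TP + FN)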
Precision Metric
• Precision measures the proportion of true positives (correctly
classified positive cases) out of all cases classified as positive.
• Precision tells us how often the model’s positive predictions
are correct, highlighting the accuracy of its relevant
predictions.
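• In confusion-matrix terms: Precision = TP / (TP + FP)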
F1-Score
• The F1-score is the harmonic mean of precision and recall, giving a single, balanced measure of model performance.
• It is high only when both precision and recall are high, so it penalizes models that trade one metric for the other.
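• Formula: F1 = 2 × (Precision × Recall) / (Precision + Recall)

A self-contained sketch computing all four metrics with scikit-learn (the breast cancer dataset and the scaled logistic regression model are illustrative assumptions):

# Compute accuracy, precision, recall, and F1 on a held-out test set.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
y_pred = model.fit(X_train, y_train).predict(X_test)

print("Accuracy: ", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:   ", recall_score(y_test, y_pred))
print("F1-score: ", f1_score(y_test, y_pred))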
Non-Parametric Models
• Non-parametric models do not make strong assumptions
about the data distribution and can have a flexible number of
parameters that can grow with the data.
• They are often more flexible but can be computationally more
expensive.
Examples: k-NN, Support Vector Machines (SVM), Decision Trees, and Random Forests.
Strengths: Can capture complex relationships in data without
assuming a specific functional form.
Weaknesses: Require large amounts of data to generalize
effectively.
Non-Parametric Models
• Non-parametric methods make minimal assumptions about the
data compared to parametric methods. However, they still rely on
some key assumptions to function effectively.
• Here are three assumptions typically associated with non-
parametric methods:
– Independence: Data points are independent and not influenced
by others.
– Random Sampling: Data represents a random sample from the
population.
– Homogeneity of Measurement: Measurements are consistent
across all data points.
Applications of Classification
Classification models are widely used in diverse fields, offering solutions to
real-world problems:
i. Healthcare: Disease diagnosis (e.g., cancer detection using image
classification).
ii. Finance: Fraud detection in credit card transactions.
iii. Marketing: Customer segmentation (e.g., classifying customers based on
purchasing behavior).
iv. Natural Language Processing (NLP): Email spam detection.
v. Image Recognition: Object detection in autonomous vehicles.
vi. Cybersecurity: Intrusion detection in networks.
vii. Education: Plagiarism detection using text classification techniques.
Limitations of Classification
Data Dependency:
• Requires labeled data, which can be expensive and time-consuming to obtain.
Overfitting:
• Complex models may overfit the training data, leading to poor generalization.
Imbalanced Data:
• Models struggle when one class dominates the dataset (e.g., fraud detection).
Computational Cost:
• Some algorithms can be computationally expensive for large datasets.
Interpretability:
• Advanced models (e.g., Neural Networks) are "black boxes," making them
hard to explain.
Class Activity
1. Implement a Support Vector Machine (SVM) classifier on the
Iris dataset. Use a linear kernel and split the data into training
and testing sets with a test size of 0.2 and random_state=42.
Calculate and print the accuracy of your model on the test set.
(7 Marks)
2. Using the breast cancer dataset from scikit-learn, implement a
binary classification model using any classifier covered in this
lecture. Print the following evaluation metrics for your model's
performance on the test set: Confusion Matrix, Accuracy,
Precision, Recall and F1-score. (13 Marks)