How To Make The Best Use Of Live Sessions
• Please log in 10 minutes before the class starts and check your internet connection to avoid any network issues during the LIVE session
• All participants will be on mute by default to avoid any background noise. However, the instructor will unmute you if required. Please use the “Questions” tab on your webinar tool to interact with the instructor at any point during the class
• Feel free to ask and answer questions to make your learning interactive. The instructor will address your queries at the end of the ongoing topic
• Raise a ticket through your LMS in case of any queries. Our dedicated support team is available 24 x 7 for your assistance
• Your feedback is very much appreciated. Please share feedback after each class, which will help us enhance your learning experience
Course Outline
▪ Introduction to Python
▪ Sequences and File Operations
▪ Deep Dive - Functions, OOPS, Modules, Errors and Exceptions
▪ Introduction to Numpy, Pandas and Matplotlib
▪ Data Manipulation
▪ Introduction to Machine Learning with Python
▪ Supervised Learning - I
▪ Dimensionality Reduction
▪ Supervised Learning - II
▪ Unsupervised Learning
▪ Association Rules Mining and Recommendation Systems
▪ Reinforcement Learning
▪ Time Series Analysis
▪ Model Selection and Boosting
Supervised Learning - II
Topics
The topics covered in this module are:
▪ Naïve Bayes Classifier
▪ Support Vector Machine (SVM)
▪ Hyperparameter Optimization
▪ Grid Search vs Random Search
Objectives
After completing this module, you should be able to:
▪ Understand what the Naïve Bayes classifier is
▪ Follow the Naïve Bayes classifier steps
▪ Build likelihood tables
▪ Predict the output
▪ Implement Naïve Bayes with GaussianNB() in Python
▪ Understand the Support Vector Machine (SVM) classifier
▪ Analyze how SVM works
▪ Perform hyperparameter optimization
▪ Compare Grid Search vs Random Search
Naïve Bayes Classifier
Let’s understand the Naïve Bayes classifier using the same use case:
‘Game Decision Forecast using Weather Data’
Naïve Bayes Classifier
It is a classification technique based on Bayes' Theorem with an assumption of independence among predictors.
In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated
to the presence of any other feature.
Let’s find what the values in Bayes’ Theorem are:

P(c|x) = P(x|c) * P(c) / P(x)

where:
▪ P(c|x) is the posterior probability of class c given predictor x
▪ P(x|c) is the likelihood: the probability of predictor x given class c
▪ P(c) is the class prior probability
▪ P(x) is the predictor prior probability
Naïve Bayes Classifier Steps
First we will create a frequency table for each attribute of the dataset:

Frequency Table – Outlook
             Play: Yes   Play: No
Sunny            2           3
Overcast         4           0
Rainy            3           2

Frequency Table – Humidity
             Play: Yes   Play: No
High             3           4
Normal           6           1

Frequency Table – Wind
             Play: Yes   Play: No
Weak             6           2
Strong           3           3
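A minimal sketch of building these frequency tables with pandas; the DataFrame df, the file name 'weather.csv', and the column names ('Outlook', 'Humidity', 'Wind', 'Play') are assumptions standing in for the 14-day weather dataset:

import pandas as pd

# Hypothetical 14-day weather dataset; 'weather.csv' is an assumed file name
df = pd.read_csv('weather.csv')

# One frequency table per attribute: rows are attribute values,
# columns are the Play classes ('Yes' / 'No')
for attribute in ['Outlook', 'Humidity', 'Wind']:
    print(pd.crosstab(df[attribute], df['Play']), '\n')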
Building Likelihood Tables
For each frequency table we will generate a likelihood table:

Likelihood Table – Outlook
             Play: Yes   Play: No
Sunny           2/9         3/5        5/14
Overcast        4/9         0/5        4/14
Rainy           3/9         2/5        5/14
                9/14        5/14

P(x|c) = P(Sunny|Yes) = 2/9  = 0.22
P(x)   = P(Sunny)     = 5/14 = 0.36
P(c)   = P(Yes)       = 9/14 = 0.64

The likelihood of ‘Yes’ given Sunny is
P(c|x) = P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny) = (0.22 x 0.64) / 0.36 = 0.3911

Similarly, the likelihood of ‘No’ given Sunny is
P(c|x) = P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny) = (0.60 x 0.36) / 0.36 = 0.60
Building Likelihood Tables
Similarly, the likelihood tables of the other attributes are:

Likelihood Table – Humidity
             Play: Yes   Play: No
High            3/9         4/5        7/14
Normal          6/9         1/5        7/14
                9/14        5/14

P(Yes|High) = 0.33 x 0.64 / 0.5 = 0.42
P(No|High)  = 0.80 x 0.36 / 0.5 = 0.58

Likelihood Table – Wind
             Play: Yes   Play: No
Weak            6/9         2/5        8/14
Strong          3/9         3/5        6/14
                9/14        5/14

P(Yes|Weak) = 0.67 x 0.64 / 0.57 = 0.75
P(No|Weak)  = 0.40 x 0.36 / 0.57 = 0.25
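Dividing each count by its class total gives a likelihood table directly; a short sketch continuing the hypothetical df from the earlier frequency-table sketch:

# P(attribute value | class): normalize each 'Play' column to sum to 1
likelihood = pd.crosstab(df['Outlook'], df['Play'], normalize='columns')
print(likelihood)  # e.g. P(Sunny|Yes) = 2/9 = 0.22, P(Sunny|No) = 3/5 = 0.60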
Predicting the Output
Suppose we have a day with the following values:

Outlook  = Rain
Humidity = High
Wind     = Weak
Play     = ?

Likelihood of ‘Yes’ on that day = P(Outlook=Rain|Yes) * P(Humidity=High|Yes) * P(Wind=Weak|Yes) * P(Yes)
= 3/9 * 3/9 * 6/9 * 9/14 = 0.0476

Likelihood of ‘No’ on that day = P(Outlook=Rain|No) * P(Humidity=High|No) * P(Wind=Weak|No) * P(No)
= 2/5 * 4/5 * 2/5 * 5/14 = 0.0457
Predicting the Output
Now we normalize the values:

P(Yes) = 0.0476 / (0.0476 + 0.0457) = 0.51
P(No)  = 0.0457 / (0.0476 + 0.0457) = 0.49

Our model predicts that there is a 51% chance there will be a game tomorrow.
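A minimal sketch of this whole calculation in Python, with the conditional probabilities hard-coded from the likelihood tables above:

from fractions import Fraction as F

# P(Rain|Yes) * P(High|Yes) * P(Weak|Yes) * P(Yes)
p_yes = F(3, 9) * F(3, 9) * F(6, 9) * F(9, 14)
# P(Rain|No) * P(High|No) * P(Weak|No) * P(No)
p_no = F(2, 5) * F(4, 5) * F(2, 5) * F(5, 14)

total = p_yes + p_no
print(f"P(Yes) = {float(p_yes / total):.2f}")  # 0.51
print(f"P(No)  = {float(p_no / total):.2f}")   # 0.49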
GaussianNB() in Python
To implement the Naïve Bayes algorithm in Python, we will use the following library and function:

# Library
from sklearn.naive_bayes import GaussianNB

# Function
gnb = GaussianNB()
y_pred_gnb = gnb.fit(X_train, y_train).predict(X_test)
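As a quick, self-contained sanity check, here is a minimal end-to-end sketch on scikit-learn’s built-in iris dataset (the dataset choice is ours, purely for illustration):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Train/test split on a small toy dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit Gaussian Naive Bayes and predict on the held-out split
gnb = GaussianNB()
y_pred_gnb = gnb.fit(X_train, y_train).predict(X_test)
print("accuracy:", (y_pred_gnb == y_test).mean())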
Use-Case 1
Use-Case 1
As discussed earlier in Module 4, we have data about hurricanes and typhoons from 1851-2014.
The data comprises the location, wind, and pressure of tropical cyclones in the Pacific Ocean.
Based on this data, we have to classify the storms into hurricanes, typhoons, and their subcategories as per the predefined classes mentioned ahead.
In this module we will implement Naïve Bayes and SVM.
Predefined Class Description
1. TD – Tropical cyclone of tropical depression intensity (< 34 knots)
2. TS – Tropical cyclone of tropical storm intensity (34-63 knots)
3. HU – Tropical cyclone of hurricane intensity (> 64 knots)
4. EX – Extratropical cyclone (of any intensity)
5. SD – Subtropical cyclone of subtropical depression intensity (< 34 knots)
6. SS – Subtropical cyclone of subtropical storm intensity (> 34 knots)
7. LO – A low that is neither a tropical cyclone, a subtropical cyclone, nor an extratropical cyclone (of any intensity)
8. DB – Disturbance (of any intensity)
Use-Case 1 Solution
The problem here has eight predefined classes. As logistic regression is best suited for binary classification, we will solve this problem with other classifiers and compare their outputs:
1. Naïve Bayes
2. SVM
Loading Necessary Libraries
We will load the necessary libraries as done earlier, using the code as shown below:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn import tree
Data Import
You can download the dataset from the LMS, then use the following code to load the data:
data = pd.read_csv('pacific.csv')
print(data.head(6))
Output
Data Manipulation
For categorical classification, we need the class labels to be numeric, so we will convert the ‘Status’ column as shown below:

# Convert the 'Status' column to a categorical type, then to numeric codes
data['Status'] = pd.Categorical(data['Status'])
data['Status'] = data['Status'].cat.codes

‘cat’ stands for categorical; each category has been assigned a number. Let’s see how the data looks now.
Plotting Typhoon Class Frequency
To see the frequency of the various categories, let’s create a frequency bar plot using the code shown below:

# Let's count the frequency of the different typhoon classes
sns.countplot(x='Status', data=data)
plt.show()
Output
Data Wrangling
We don’t need columns such as ID, Name, Event, Latitude, and Longitude to classify the data, and ‘Status’ is the target rather than a predictor.
Hence we need to drop these columns from the prediction variables as shown below:

# Work on a copy so the original DataFrame stays intact
pred_columns = data.copy()
pred_columns.drop(['ID', 'Name', 'Event', 'Status', 'Latitude', 'Longitude'],
                  axis=1, inplace=True)
prediction_var = pred_columns.columns
print(list(prediction_var))
Output:
Train-Test Split
Training and testing partitions are used to provide:
▪ An honest assessment of the performance of our predictive models on data they have never seen
▪ A simple evaluation procedure with little extra mathematical reasoning or manipulation of results

Scikit-learn provides a function called train_test_split to split the data:

# Split the main data into train and test partitions
train, test = train_test_split(data, test_size=0.3)

# We can check their dimensions
print(train.shape)
print(test.shape)
Output
Creating Predictor and Target Variables
To create the predictor (X) and target (y) variables, we will use the following code:
#taking the training data input
train_X = train[prediction_var]
train_y = train['Status']
print(list(train.columns))
#same for test
test_X = test[prediction_var]#taking test data inputs
test_y = test['Status'] #output value of test data
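Before evaluating, we fit Gaussian Naïve Bayes on these splits; a short sketch reusing the GaussianNB() call shown earlier with the variables created above:

from sklearn.naive_bayes import GaussianNB

# Fit on the training predictors/target and predict on the test inputs
gnb = GaussianNB()
y_pred_gnb = gnb.fit(train_X, train_y).predict(test_X)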
Confusion Matrix
Let’s create a confusion matrix using the function below:

from sklearn.metrics import confusion_matrix

cnf_matrix_gnb = confusion_matrix(test_y, y_pred_gnb)
print(cnf_matrix_gnb)

Output
That’s hard to interpret; let’s calculate the accuracy to evaluate it.
Accuracy Prediction
To check model performance, we will see how many inputs have been incorrectly classified using the code below:

print("Number of mislabeled points out of a total %d points: %d"
      % (data.shape[0], (test_y != y_pred_gnb).sum()))

Out of 26137 points, 7411 have been misclassified.
Hence accuracy = (26137 - 7411) / 26137 = 0.7164
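Equivalently, a one-line sketch with scikit-learn’s accuracy_score, which reports the fraction of correctly classified test points directly (note it divides by the test-set size rather than the full dataset size used above):

from sklearn.metrics import accuracy_score

print("Naive Bayes accuracy:", accuracy_score(test_y, y_pred_gnb))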
Support Vector Machine (SVM)
Support Vector Machine
• Support Vector Machine (SVM) is a supervised machine learning algorithm which can be used for both classification and regression challenges.
• It tries to define a hyperplane which can split the data in the most optimal way, such that there is a wide margin between the hyperplane and the observations.
• It is one of the most effective algorithms in Machine Learning.
[Figure: a classifier line separating two groups of data points]
What is a Hyperplane?
A hyperplane is a generalization of a plane:
➢ in one dimension, a hyperplane is a point
➢ in two dimensions, it is a line
➢ in three dimensions, it is a plane
➢ in more dimensions, you can call it a hyperplane
Example: a single point is the separating hyperplane in one dimension.
Support Vector Machine
An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible.
New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall.
Support vectors are simply the coordinates of the individual observations that lie closest to the separating hyperplane.
[Figure: separating hyperplane with its margin; the support vectors sit on the margin boundaries]
How it works
Suppose we have two classes plotted as shown below. Just by looking at the plot, we can see that it is possible to separate the data using a straight line.
[Figure: two classes of points plotted on X-Y axes]
How it works
We can draw a separating line as shown below; in fact, multiple separating lines can be drawn here.
[Figure: several candidate separating lines between the two classes]
How it works
The purpose of SVM is to find the optimal hyperplane: we need to choose the one hyperplane which will separate this data in an optimal way.
[Figure: candidate hyperplanes between the two classes]
How it works
If we choose the red hyperplane, we can see that some of the observations will get misclassified. Intuitively, we can see that if we select a hyperplane which is close to the data points of one class, then it might not generalize well. So we will try to select a hyperplane as far as possible from the data points of each category.
[Figure: a poorly chosen (red) hyperplane lying close to one class]
How it works
Such a hyperplane will classify real-life data well. Now let’s see how we arrive at the optimal hyperplane.
[Figure: the optimal hyperplane, far from the points of both classes]
Choosing Optimal Hyperplane
Given a particular hyperplane, we can compute the distance between the hyperplane and the closest data point. Once we have this value, doubling it gives us what is called the margin.
[Figure: distance between the hyperplane and the closest point]
Choosing Optimal Hyperplane
Basically, the margin is a no man's land: there will never be any data point inside the margin. Similarly, we will find the margin for every other hyperplane.
[Figure: the margin around a candidate hyperplane, bounded by the closest points]
Choosing Optimal Hyperplane
After we find the margins of all the hyperplanes, we select the hyperplane having the largest margin as our separating hyperplane. Here Margin 2 is greater, so we will select the purple line as our optimal hyperplane.
[Figure: Margin 1 vs Margin 2; the purple line with the larger margin is the optimal hyperplane]
svm() in Python
svm() in Python
For implementing it we need to load the library using the code below:

from sklearn import svm

The syntax for the support vector machine function is:

model = svm.SVC(kernel='linear', C=1, gamma=1)

Let’s understand what the kernel, C, and gamma values are.
Kernels in SVM
There are three commonly used kernels in svm:
▪ Linear: when data is linearly separable, we use the linear kernel.
▪ Polynomial: when data is not linearly separable but can be classified using a curve, we use the polynomial kernel.
▪ RBF: when data is not linearly separable and cannot be classified using a simple curve, we use the RBF kernel.
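As a brief illustration, each kernel is selected through the kernel parameter of svm.SVC; the degree and gamma values below are illustrative defaults, not tuned choices:

from sklearn import svm

# One classifier per kernel type discussed above
linear_clf = svm.SVC(kernel='linear')
poly_clf = svm.SVC(kernel='poly', degree=3)     # degree controls the curve
rbf_clf = svm.SVC(kernel='rbf', gamma='scale')  # gamma shapes the RBF peaks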
The ‘C’-Value
The ‘C’ value determines the width of the margin: the larger the C value, the smaller the margin.
The ‘C’ value directly affects the misclassification error, since a large C penalizes misclassified points heavily while a small C tolerates more of them in exchange for a wider margin.
The recommended range for ‘C’ is 2^-10 to 2^10.
The ‘Gamma’-Value
Gamma is the parameter of the Gaussian (RBF) kernel, used to handle non-linear classification.
If the data is not linearly separable in 2D, we want to transform it to a higher dimension where it will be linearly separable.
Imagine “raising” the green points; then you can separate them from the red points with a plane (hyperplane).
To “raise” the points we use the RBF kernel; gamma controls the shape of the “peaks” where the points are raised.
The ‘Gamma’-Value
A large gamma gives you a pointed bump in the higher dimensions, while a small gamma gives you a softer, broader bump.
So a large gamma will give you low bias and high variance, while a small gamma will give you higher bias and low variance.
Hyperparameter Search
We now know that we can configure our model using hyperparameters. Choosing their best values manually is a tedious task, hence we use hyperparameter search to do it for us.
Hyperparameter search can be done in two ways (a short code sketch follows each description below):
1. Grid Search
2. Random Search
Grid Search
In grid search the parameter values are equally spaced within the parameter range specified by the user. Each value and its resulting output are checked before deciding on the final value.
For the range -10 to 10, grid search might try values such as
-9, -7, -5, -3, -1, 1, 3, 5, 7, 9 and so on
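A minimal sketch of grid search using scikit-learn’s GridSearchCV, reusing the train_X/train_y variables from the use case; the grid values are illustrative assumptions:

from sklearn.model_selection import GridSearchCV
from sklearn import svm

# Every combination of these C and gamma values is evaluated with 3-fold CV
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.001, 0.01, 0.1, 1]}

grid = GridSearchCV(svm.SVC(kernel='rbf'), param_grid, cv=3)
grid.fit(train_X, train_y)
print(grid.best_params_, grid.best_score_)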
Random Search
In the random search method, the hyperparameter values are chosen at random within the range specified by the user.
For the range -10 to 10, random search might try values such as
-9, +9, 0, 3, 7, -8, 5, -3, 9 and so on
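The analogous sketch with RandomizedSearchCV, which samples a fixed number of settings instead of trying every combination; the distributions and n_iter are illustrative assumptions (scipy is required for loguniform):

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import loguniform
from sklearn import svm

# Sample 10 random (C, gamma) settings from log-uniform ranges
param_dist = {'C': loguniform(2**-10, 2**10), 'gamma': loguniform(1e-4, 1e1)}

search = RandomizedSearchCV(svm.SVC(kernel='rbf'), param_dist,
                            n_iter=10, cv=3, random_state=0)
search.fit(train_X, train_y)
print(search.best_params_, search.best_score_)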
SVM Model Building
Let’s build the SVM model for our use case in Python. Hyperparameter search, as sketched earlier, can be used to pick the kernel, C, and gamma values:

from sklearn.metrics import accuracy_score
from sklearn import svm

# Import and fit the SVM classifier
model = svm.SVC(kernel='linear')
model.fit(train_X, train_y)

# Predict the output
predicted = model.predict(test_X)
Model Accuracy
To check the model accuracy, use the following code:

print("SVM accuracy:", accuracy_score(test_y, predicted))
Eager vs Lazy Learner
Eager Learner:
▪ A generalized model is constructed from the training dataset
▪ Using the model, the class of the test dataset is predicted
▪ Example: Decision Tree

Lazy Learner:
▪ The training dataset is stored in the system to build the model
▪ On querying, the similarity between the test data and the training set records is calculated to predict the class of the test data
▪ Example: K-nearest neighbour
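A short sketch contrasting the two learner styles on the use-case splits; the model choices mirror the examples named above, and the unscaled inputs are used purely for illustration:

from sklearn.tree import DecisionTreeClassifier     # eager: builds a model up front
from sklearn.neighbors import KNeighborsClassifier  # lazy: defers work to query time

eager = DecisionTreeClassifier().fit(train_X, train_y)  # generalizes during fit
lazy = KNeighborsClassifier().fit(train_X, train_y)     # essentially stores the data

# Both predict the same way; the work just happens at different times
print(eager.predict(test_X[:5]))
print(lazy.predict(test_X[:5]))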
Summary
In this module, you should have learned:
▪ What the Naïve Bayes classifier is
▪ How the Naïve Bayes classifier works
▪ The Support Vector Machine (SVM) classifier
▪ How SVM works
▪ Hyperparameter optimization
▪ Grid Search vs Random Search
Copyright © 2018, edureka and/or its affiliates. All rights reserved.