9 Supervised Learning - II

How To Make The Best Use Of Live Sessions

• Please log in 10 minutes before the class starts and check your internet connection to avoid any network issues during the LIVE session

• All participants will be muted by default to avoid background noise; the instructor will unmute you if required. Please use the “Questions” tab on your webinar tool to interact with the instructor at any point during the class

• Feel free to ask and answer questions to make your learning interactive. The instructor will address your queries at the end of the ongoing topic

• Raise a ticket through your LMS in case of any queries. Our dedicated support team is available 24 x 7 for your assistance

• Your feedback is very much appreciated. Please share feedback after each class to help us enhance your learning experience



Course Outline

▪ Introduction to Python
▪ Sequences and File Operations
▪ Deep Dive-Functions, OOPS, Modules, Errors and Exceptions
▪ Introduction to Numpy, Pandas and Matplotlib
▪ Data Manipulation
▪ Introduction to Machine Learning with Python
▪ Supervised Learning - I
▪ Dimensionality Reduction
▪ Supervised Learning - II
▪ Unsupervised Learning
▪ Association Rules Mining and Recommendation Systems
▪ Reinforcement Learning
▪ Time Series Analysis
▪ Model Selection and Boosting


Supervised Learning - II
Topics
The topics covered in this module are:
▪ Naïve Bayes Classifier
▪ Support Vector Machine (SVM)
▪ Hyperparameter Optimization
▪ Grid Search vs Random Search



Objectives
After completing this module, you should be able to:
▪ Understand what the Naïve Bayes classifier is
▪ Follow the Naïve Bayes classifier steps
▪ Build likelihood tables
▪ Predict the output
▪ Use naiveBayes() in Python
▪ Describe the Support Vector Machine (SVM) classifier
▪ Analyze how SVM works
▪ Perform hyperparameter optimization
▪ Implement Grid Search vs Random Search


Naïve Bayes Classifier



Let’s understand Naïve Bayes classifier using
the same Use-Case

‘Game Decision Forecast using Weather Data’



Naïve Bayes Classifier
It is a classification technique based on Bayes' Theorem with an assumption of independence among predictors.

In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated
to the presence of any other feature.

Let’s find what these values are:

    P(c|x) = P(x|c) * P(c) / P(x)

where:
▪ P(c|x) is the posterior probability of class c given predictor x
▪ P(x|c) is the likelihood of predictor x given class c
▪ P(c) is the class prior probability
▪ P(x) is the predictor prior probability


Naïve Bayes Classifier Steps
First we will create a frequency table for each attribute of the dataset:

Frequency Table: Outlook
              Play = Yes   Play = No
  Sunny            2            3
  Overcast         4            0
  Rainy            3            2

Frequency Table: Humidity
              Play = Yes   Play = No
  High             3            4
  Normal           6            1

Frequency Table: Wind
              Play = Yes   Play = No
  Weak             6            2
  Strong           3            3


Building Likelihood Tables
For each frequency table we will generate a likelihood table.

Likelihood Table: Outlook
              Play = Yes   Play = No   Total
  Sunny          2/9          3/5       5/14
  Overcast       4/9          0/5       4/14
  Rainy          3/9          2/5       5/14
  Total          9/14         5/14

P(x|c) = P(Sunny|Yes) = 2/9  = 0.22
P(x)   = P(Sunny)     = 5/14 = 0.36
P(c)   = P(Yes)       = 9/14 = 0.64

Likelihood of ‘Yes’ given Sunny:
P(c|x) = P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny) = (0.22 x 0.64) / 0.36 = 0.3911

Similarly, likelihood of ‘No’ given Sunny:
P(c|x) = P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny) = (0.60 x 0.36) / 0.36 = 0.60


Building Likelihood Tables
Similarly, the likelihood tables of the other attributes are:

Likelihood Table: Humidity
              Play = Yes   Play = No   Total
  High           3/9          4/5       7/14
  Normal         6/9          1/5       7/14
  Total          9/14         5/14

Likelihood Table: Wind
              Play = Yes   Play = No   Total
  Weak           6/9          2/5       8/14
  Strong         3/9          3/5       6/14
  Total          9/14         5/14

P(Yes|High) = 0.33 x 0.64 / 0.5 = 0.42
P(No|High)  = 0.80 x 0.36 / 0.5 = 0.58

P(Yes|Weak) = 0.67 x 0.64 / 0.57 = 0.75
P(No|Weak)  = 0.40 x 0.36 / 0.57 = 0.25


Predicting the Output
Suppose we have a day with the following values:
Outlook = Rain
Humidity = High
Wind = Weak
Play = ?

Likelihood of ‘Yes’ on that day = P(Outlook=Rain|Yes) * P(Humidity=High|Yes) * P(Wind=Weak|Yes) * P(Yes)
                                = 3/9 * 3/9 * 6/9 * 9/14 = 0.0476

Likelihood of ‘No’ on that day  = P(Outlook=Rain|No) * P(Humidity=High|No) * P(Wind=Weak|No) * P(No)
                                = 2/5 * 4/5 * 2/5 * 5/14 = 0.0457


Predicting the Output
Now we normalize the values:

P(Yes) = 0.0476 / (0.0476 + 0.0457) = 0.51
P(No)  = 0.0457 / (0.0476 + 0.0457) = 0.49

Since P(Yes) > P(No), our model predicts that a game on that day is slightly more likely than not.
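
As a sanity check, the whole hand calculation fits in a few lines of Python. This is a minimal sketch; the fractions are simply the counts read off the frequency tables above.

# Per-feature likelihoods multiplied under the naive independence assumption
likelihood_yes = (3/9) * (3/9) * (6/9) * (9/14)  # Rain|Yes * High|Yes * Weak|Yes * P(Yes)
likelihood_no  = (2/5) * (4/5) * (2/5) * (5/14)  # Rain|No  * High|No  * Weak|No  * P(No)

# Normalize so the two posteriors sum to 1
total = likelihood_yes + likelihood_no
print(f"P(Yes) = {likelihood_yes / total:.2f}")  # ~0.51
print(f"P(No)  = {likelihood_no / total:.2f}")   # ~0.49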



naiveBayes() in Python
To implement the Naïve Bayes algorithm in Python, we will use the following library and function:

from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()
y_pred_gnb = gnb.fit(X_train, y_train).predict(X_test)
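
Here is a self-contained sketch of the same pattern, end to end. The iris dataset is used purely for illustration; any labelled dataset works the same way.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load a toy dataset and hold out 30% for testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit the Gaussian Naive Bayes model and predict on the held-out set
gnb = GaussianNB()
y_pred_gnb = gnb.fit(X_train, y_train).predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred_gnb))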



Use-Case 1



Use-Case 1
As discussed earlier in Module 4, we have data about hurricanes and typhoons from 1851 to 2014.

The data comprises the location, wind speed, and pressure of tropical cyclones in the Pacific Ocean.

Based on the data, we have to classify the storms into hurricanes, typhoons and their sub-categories as per the predefined classes mentioned ahead.

In this module we will implement Naïve Bayes and SVM.


Predefined Class Description
1. TD – Tropical cyclone of tropical depression intensity (< 34 knots)

2. TS – Tropical cyclone of tropical storm intensity (34-63 knots)

3. HU – Tropical cyclone of hurricane intensity (> 64 knots)

4. EX – Extratropical cyclone (of any intensity)

5. SD – Subtropical cyclone of subtropical depression intensity (< 34 knots)

6. SS – Subtropical cyclone of subtropical storm intensity (> 34 knots)

7. LO – A low that is neither a tropical cyclone, a subtropical cyclone, nor an extratropical cyclone (of any intensity)

8. DB – Disturbance (of any intensity)


Use-Case 1 Solution
The problem here has eight predefined classes. As logistic regression is best suited for binary classification, we will solve this problem with other classifiers and compare their outputs:

1. Naïve Bayes

2. SVM


Loading Necessary Libraries
We will load the necessary libraries as done earlier, using the code as shown below:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn import tree



Data Import
You can download the dataset from the LMS, then use the following code to load the data:

data = pd.read_csv('pacific.csv')
print(data.head(6))

Output



Data Manipulation
For classification, we need the class labels to be numeric, so we will convert the ‘Status’ column as shown below:

data['Status'] = pd.Categorical(data['Status'])
data['Status'] = data['Status'].cat.codes

cat is the categorical accessor; each category has been assigned a numeric code. Let’s see how the data looks now.


Plotting Typhoon Class Frequency
To see the frequency of the various categories, let’s create a frequency bar plot using the code shown below:
# let's count the frequency of the different typhoon classes
sns.countplot(x=data['Status'], label="Count")
plt.show()

Output



Data Wrangling
We don’t need columns such as ID, Name, Event, Latitude and Longitude to classify the data, and ‘Status’ is the label itself.

Hence we drop these columns from the prediction variables as shown below:

pred_columns = data[:]
pred_columns.drop(['Status'], axis = 1, inplace = True)
pred_columns.drop(['Event'], axis = 1, inplace = True)
pred_columns.drop(['Latitude'], axis = 1, inplace = True)
pred_columns.drop(['Longitude'], axis = 1, inplace = True)
pred_columns.drop(['ID'], axis = 1, inplace = True)
pred_columns.drop(['Name'], axis = 1, inplace = True)
prediction_var = pred_columns.columns
print(list(prediction_var))

Output:
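
The six drop() calls can also be collapsed into one, since pandas’ drop() accepts a list of column names; a compact equivalent:

# Same result in a single call; drop() returns a new DataFrame,
# leaving the original data untouched
pred_columns = data.drop(['Status', 'Event', 'Latitude', 'Longitude', 'ID', 'Name'], axis=1)
prediction_var = pred_columns.columns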



Train-Test Split
Training and testing partitions are used to provide:
▪ Honest assessments of the performance of our predictive models
▪ The least amount of mathematical reasoning and manipulation of results

Scikit-learn provides a function called train_test_split to split the data into training and testing sets:

# our main data is split into train and test
train, test = train_test_split(data, test_size=0.3)
# we can check their dimensions
print(train.shape)
print(test.shape)

Output



Creating Response and Target Variables
To create the response and target variables, we will use the following code:

# taking the training data inputs
train_X = train[prediction_var]
train_y = train['Status']
print(list(train.columns))

# same for test
test_X = test[prediction_var]  # taking test data inputs
test_y = test['Status']        # output values of test data
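
With the inputs and outputs in place, we fit the GaussianNB classifier introduced earlier; a minimal sketch using this module’s variable names, producing the y_pred_gnb predictions evaluated on the next slides:

from sklearn.naive_bayes import GaussianNB

# Fit on the training partition, predict on the held-out test partition
gnb = GaussianNB()
y_pred_gnb = gnb.fit(train_X, train_y).predict(test_X)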



Confusion Matrix
Let’s create a confusion matrix using the function below:

from sklearn.metrics import confusion_matrix

cnf_matrix_gnb = confusion_matrix(test_y, y_pred_gnb)
print(cnf_matrix_gnb)

Output

That’s hard to interpret; let’s calculate the accuracy to evaluate it.



Accuracy Prediction
To check the model performance, we will count how many points have been incorrectly classified using the code below:

print("Number of mislabeled points out of a total %d points: %d"
      % (data.shape[0], (test_y != y_pred_gnb).sum()))

Out of 26137 points, 7411 have been misclassified.

Hence accuracy = (26137 - 7411) / 26137 = 0.7164
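
Equivalently, scikit-learn’s accuracy_score helper computes the test-set accuracy in one call:

from sklearn.metrics import accuracy_score

print("GNB accuracy:", accuracy_score(test_y, y_pred_gnb))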



Support Vector Machine(SVM)



Support Vector Machine
• Support Vector Machine (SVM) is a supervised machine learning algorithm which can be used for both classification and regression challenges.

• It tries to define a hyperplane which splits the data in the most optimal way, such that there is a wide margin between the hyperplane and the observations.

• It is one of the most efficient algorithms in Machine Learning.

[Figure: a classifier line separating two sets of data points]


What is a Hyperplane
A hyperplane is a generalization of a plane:
➢ in one dimension, a hyperplane is a point
➢ in two dimensions, it is a line
➢ in three dimensions, it is a plane
➢ in higher dimensions, it is simply called a hyperplane

Example: a single point is the separating hyperplane in one dimension.


Support Vector Machine
An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible.

New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall.

Support vectors are simply the coordinates of the individual observations that lie closest to the separating hyperplane and define the margin.

[Figure: separating hyperplane with its margin; the nearest points are the support vectors]


How it works
Suppose we have two classes plotted on the X and Y axes. Just by looking at the plot, we can see that it is possible to separate the data using a straight line.


How it works
We can draw a separating line; in fact, multiple separating lines can be drawn here.


How it works
The purpose of SVM is to find the optimal hyperplane. We need to choose the one hyperplane which will separate this data in an optimal way.


How it works
If we choose a hyperplane close to the data points of one class, some of the observations may get misclassified; intuitively, such a hyperplane might not generalize well. So we will try to select a hyperplane as far as possible from the data points of each category.


How it works
Such a hyperplane will classify real-life data well. Now let’s see how we arrive at the optimal hyperplane.


Choosing Optimal Hyperplane
Given a particular hyperplane, we can compute the distance between the hyperplane and the closest data point. Once we have this value, doubling it gives us what is called the margin.
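
In coordinates, that distance has a closed form: for a hyperplane w·x + b = 0, the distance from a point x0 is |w·x0 + b| / ||w||. A tiny numeric sketch (the weight vector, bias and point below are made-up illustrative values):

import numpy as np

w, b = np.array([2.0, 1.0]), -1.0  # hyperplane: 2*x1 + 1*x2 - 1 = 0
x0 = np.array([1.0, 1.0])          # a data point

distance = abs(w @ x0 + b) / np.linalg.norm(w)  # ~0.894
margin = 2 * distance                            # doubling the closest distance gives the margin
print(distance, margin)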


Choosing Optimal Hyperplane
Basically, the margin is a no man’s land: there will never be any data point inside the margin. Similarly, we will find the margin for every other candidate hyperplane.


Choosing Optimal Hyperplane
After we find the margins of all the hyperplanes, we select the hyperplane having the largest margin as our separating hyperplane. Here Margin 2 is greater, so we select the purple line as our optimal hyperplane.

[Figure: two candidate hyperplanes with Margin 1 and Margin 2; the purple line with the larger Margin 2 is the optimal hyperplane]


svm() in Python



svm() in Python
To implement SVM we need to load the library using the code below:

from sklearn import svm

The syntax for the support vector machine function is:

model = svm.SVC(kernel='linear', C=1, gamma=1)

Let’s understand what the kernel, C and gamma values are.


Kernels in SVM
There are three types of kernel available in svm:

▪ Linear – when the data is linearly separable, we use the linear kernel.
▪ Polynomial – when the data is not linearly separable but can be classified using a curve, we use the polynomial kernel.
▪ RBF – when the data is not linearly separable and cannot be classified using a curve, we use the RBF kernel.
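
For reference, a quick sketch of how each kernel is selected in scikit-learn; the degree and gamma values below are illustrative, not recommendations:

from sklearn import svm

linear_clf = svm.SVC(kernel='linear')          # linearly separable data
poly_clf   = svm.SVC(kernel='poly', degree=3)  # separable by a polynomial curve
rbf_clf    = svm.SVC(kernel='rbf', gamma=0.1)  # complex non-linear boundaries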



The ‘C’-Value
The ‘C’ value determines the width of the margin: the larger the C value, the smaller the margin.

The ‘C’ value directly affects the misclassification error: a small C tolerates more misclassified training points.

Recommended values of ‘C’ range from 2^-10 to 2^10.
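
That range is usually searched on a log scale; a small illustrative sketch of generating it:

import numpy as np

C_range = 2.0 ** np.arange(-10, 11)  # 2^-10, 2^-9, ..., 2^10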



The ‘Gamma’-Value
Gamma is the parameter of a Gaussian (RBF) kernel, used to handle non-linear classification.

If the data is not linearly separable in 2D, we want to transform it to a higher dimension where it becomes linearly separable. Imagine "raising" the green points; then you can separate them from the red points with a plane (a hyperplane).

To "raise" the points we use the RBF kernel; gamma controls the shape of the "peaks" where the points are raised.


The ‘Gamma’-Value
A large gamma gives you pointed, narrow bumps in the higher dimension; a small gamma gives you softer, broader bumps.

So a large gamma will give you low bias and high variance, while a small gamma will give you higher bias and low variance. In practice, too large a gamma overfits the training data, while too small a gamma underfits.


Hyperparameter Search
We now know that we can configure our model using hyperparameters. Choosing their best values manually is a tedious task, hence we use hyperparameter search to automate it.

Hyperparameter search can be done in two ways:

1. Grid Search

2. Random Search


Grid Search
In grid search, the parameter values are equally spaced within the parameter range specified by the user. Each value and its output are checked before deciding on the final value.

For the range -10 to 10, grid search could try the values
-9, -7, -5, -3, -1, 1, 3, 5, 7, 9 and so on
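
In scikit-learn this is done with GridSearchCV, which evaluates every combination in the grid with cross-validation. A minimal sketch, assuming the train_X/train_y variables from the use case above; the grid values are illustrative:

from sklearn.model_selection import GridSearchCV
from sklearn import svm

# every combination of these values is tried
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.001, 0.01, 0.1, 1],
}
grid = GridSearchCV(svm.SVC(kernel='rbf'), param_grid, cv=3)
grid.fit(train_X, train_y)
print(grid.best_params_, grid.best_score_)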



Random Search
In the random search method, the values of the hyperparameters are randomly chosen within the range specified by the user.

For the range -10 to 10, random search could try values such as
-9, +9, 0, 3, 7, -8, 5, -3, 9 and so on
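
The scikit-learn counterpart is RandomizedSearchCV; a minimal sketch under the same assumptions as the grid search example (scipy’s loguniform spreads the draws evenly on a log scale):

from sklearn.model_selection import RandomizedSearchCV
from scipy.stats import loguniform
from sklearn import svm

# n_iter random draws from these distributions are tried
param_dist = {
    'C': loguniform(2**-10, 2**10),
    'gamma': loguniform(1e-4, 1e1),
}
search = RandomizedSearchCV(svm.SVC(kernel='rbf'), param_dist,
                            n_iter=20, cv=3, random_state=42)
search.fit(train_X, train_y)
print(search.best_params_, search.best_score_)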



SVM Model Building
Let’s write the code to build the SVM model in Python (the hyperparameter search sketches above can be used to choose the kernel, C and gamma):

from sklearn.metrics import accuracy_score
from sklearn import svm  # import the svm classifier

model = svm.SVC(kernel='linear')
model.fit(train_X, train_y)

# Predict output
predicted = model.predict(test_X)


Model Accuracy
To check the model accuracy, use the following code:

print("SVM accuracy:", accuracy_score(test_y, predicted))



Eager vs Lazy Learner

Eager Learner:
▪ A generalized model is constructed from the training dataset
▪ Using the model, the class of the test dataset is predicted
▪ Example: Decision Tree

Lazy Learner:
▪ The training dataset is stored in the system to build the model
▪ On querying, the similarity between the test data and the training set records is calculated to predict the class of the test data
▪ Example: K-nearest neighbour


Summary
In this module, you should have learnt:
▪ What the Naïve Bayes classifier is
▪ How the Naïve Bayes classifier works
▪ The Support Vector Machine (SVM) classifier
▪ How SVM works
▪ Hyperparameter optimization
▪ Grid Search vs Random Search



Copyright © 2018, edureka and/or its affiliates. All rights reserved.