Case Study - Classifier
NAME - AJINKYA KSHIRSAGAR
PRN - 19030141005
COURSE - Applied Data Analytics with Python
SEM - 3
Assignment No: 3
Title of the Assignment:-
Case Study : Machine Learning : Classifiers
Instructions
1. Select any data set of your own choice
2. Use the following classifiers for data prediction:
a. SVM
b. K-NN
c. K-means clustering
d. Decision Tree classifier
3. Do a comparative study and discuss the predictions from these classifiers.
CASE STUDY
Choosing the right estimator for Machine Learning
INTRODUCTION
Machine learning (ML) is the study of computer algorithms that improve automatically through experience. It is seen as a subset of artificial
intelligence.
Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or
decisions without being explicitly programmed to do so.
Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or
infeasible to develop conventional algorithms to perform the needed tasks.
Example
A learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a
single number and, for instance, a multi-dimensional entry (aka multivariate data), it is said to have several attributes or features.
Learning problems fall into a few categories:
1. Supervised learning
Learning in which the data comes with additional attributes that we want to predict.
This problem can be either:
Classification:
If the samples belong to two or more classes and we want to learn from already labeled data how to predict the class of unlabeled data. An
example of a classification problem would be handwritten digit recognition, in which the aim is to assign each input vector to one of a finite
number of discrete categories. Another way to think of classification is as a discrete (as opposed to continuous) form of supervised learning,
where one has a limited number of categories and, for each of the n samples provided, one tries to label them with the correct category or
class.
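For instance, a minimal classification sketch using scikit-learn's bundled digits dataset (illustrative only; this is not the dataset used in the case study below):

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()  # 1797 images of handwritten digits, 64 pixel features each
X_tr, X_te, y_tr, y_te = train_test_split(digits.data, digits.target, test_size=0.2, random_state=0)
clf = SVC().fit(X_tr, y_tr)   # learn a mapping from pixels to digit labels
print(clf.score(X_te, y_te))  # accuracy on digits the model has never seen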
Regression:
If the desired output consists of one or more continuous variables, then the task is called regression. An example of a regression problem
would be the prediction of the length of a salmon as a function of its age and weight.
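A hedged sketch of the salmon example with synthetic numbers (the ranges and coefficients below are made up purely for illustration):

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
age = rng.uniform(1, 5, 100)         # years (synthetic)
weight = rng.uniform(0.5, 6.0, 100)  # kilograms (synthetic)
length = 10 + 8 * age + 3 * weight + rng.normal(0, 1, 100)  # centimetres

X_fish = np.column_stack([age, weight])
reg = LinearRegression().fit(X_fish, length)
print(reg.predict([[3.0, 2.5]]))  # a continuous output, not a class label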
2. Unsupervised learning
In which the training data consists of a set of input vectors x without any corresponding target values.
The goal in such problems may be:
Clustering:
To discover groups of similar examples within the data; this task is called clustering.
Density estimation:
To determine the distribution of data within the input space; this is known as density estimation.
Visualization:
To project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization; a sketch covering clustering and visualization follows.
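A minimal unsupervised sketch covering clustering and visualization (the iris measurements stand in as unlabeled data; the choice of dataset and of three clusters is an assumption for illustration):

from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X_unlabeled = load_iris().data  # the target values are never used
labels = KMeans(n_clusters=3, n_init=10).fit_predict(X_unlabeled)   # clustering
X_2d = PCA(n_components=2).fit_transform(X_unlabeled)  # 4-D down to 2-D for plotting
print(labels[:10], X_2d.shape)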
CHOOSING THE RIGHT ESTIMATOR
[Figure: scikit-learn's 'Choosing the right estimator' flowchart]
IMPLEMENTATION PART -- DATASET USED: Bill Authentication
PREPROCESSING PART
1. Importing Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
2. Importing Dataset
DF= pd.read_csv("/content/bill_authentication.csv")
DF.head()
Variance Skewness Curtosis Entropy V1 V2 Class
0 3.62160 8.6661 -2.8073 -0.44699 2.072345 -3.241693 0
1 4.54590 8.1674 -2.4586 -1.46210 17.936710 15.784810 0
2 3.86600 -2.6383 1.9242 0.10645 1.083576 7.319176 0
3 3.45660 9.5228 -4.0112 -3.59440 11.120670 14.406780 0
4 0.32924 -4.4552 4.5718 -0.98880 23.711550 2.557729 0
3. Target & Predictor Variables
# Features: drop the target column 'Class' and the two extra columns V1, V2
X = DF.drop(['Class', 'V1', 'V2'], axis=1)
# Target: the banknote class label
Y = DF['Class']
4. Splitting into Train & Test
from sklearn.model_selection import train_test_split
# Hold out 20% of the rows for testing
X_train, x_test, Y_train, y_test = train_test_split(X, Y, test_size = 0.20)
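The split above is unseeded, so every run draws a different test set and the metrics below vary slightly from run to run. A hedged variant (the seed value and the use of stratify are assumptions, not part of the original run):

# random_state fixes the shuffle so reported metrics are repeatable;
# stratify keeps the class ratio the same in train and test
X_train, x_test, Y_train, y_test = train_test_split(X, Y, test_size=0.20, random_state=42, stratify=Y)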
CLASSIFIER PART
1. Support Vector Machine (SVM) Classifier
from sklearn.svm import SVC
SVM_Classifier = SVC(kernel='linear')
SVM_Classifier.fit(X_train, Y_train)
SVC(C=1.0, break_ties=False, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='scale', kernel='linear',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
y_pred = SVM_Classifier.predict(x_test)
from sklearn.metrics import classification_report, confusion_matrix
print('Support Vector Machine : \n', classification_report(y_test, y_pred))
Support Vector Machine :
precision recall f1-score support
0 0.97 1.00 0.99 149
1 1.00 0.97 0.98 126
accuracy 0.99 275
macro avg 0.99 0.98 0.99 275
weighted avg 0.99 0.99 0.99 275
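confusion_matrix is imported above but never displayed; a short sketch that prints it for the SVM predictions:

# Rows are true classes, columns are predicted classes; the off-diagonal
# entries are the few misclassified notes behind the 0.99 accuracy above.
print('Confusion matrix:\n', confusion_matrix(y_test, y_pred))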
2. K-Nearest Neighbour (KNN) Classifier
from sklearn.neighbors import KNeighborsClassifier
# k = 1: each test point takes the label of its single nearest neighbour
KNN_Classifier = KNeighborsClassifier(n_neighbors = 1)
KNN_Classifier.fit(X_train, Y_train)
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=None, n_neighbors=1, p=2,
weights='uniform')
y_pred = KNN_Classifier.predict(x_test)
from sklearn.metrics import classification_report, confusion_matrix
print('K-Nearest Neighbour : \n',classification_report(y_test,y_pred))
K-Nearest Neighbour :
precision recall f1-score support
0 1.00 1.00 1.00 149
1 1.00 1.00 1.00 126
accuracy 1.00 275
macro avg 1.00 1.00 1.00 275
weighted avg 1.00 1.00 1.00 275
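With n_neighbors = 1 the model memorizes the training set and can overfit noisier data. A hedged sketch of choosing k by 5-fold cross-validation on the training split (the candidate values of k are an assumption):

from sklearn.model_selection import cross_val_score

# Mean cross-validated accuracy for a few odd values of k
for k in (1, 3, 5, 7, 9):
    knn = KNeighborsClassifier(n_neighbors=k)
    print(k, cross_val_score(knn, X_train, Y_train, cv=5).mean().round(4))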
3. K-Means Clustering
# Visualize the two extra columns V1 and V2 as a 2-D scatter
f1 = DF['V1'].values
f2 = DF['V2'].values
z = np.array(list(zip(f1, f2)))
plt.scatter(f1, f2, c='black', s=7)
[Scatter plot of V1 vs. V2]
# Euclidean distance helper (defined for a manual K-means walk-through;
# not used further below)
def dist(a, b, ax=1):
    return np.linalg.norm(a - b, axis=ax)

# Pick k = 2 random starting centroids inside the data range
k = 2
C_x = np.random.randint(0, np.max(z)-20, size=k)
C_y = np.random.randint(0, np.max(z)-20, size=k)
C = np.array(list(zip(C_x, C_y)), dtype=np.float32)
print("Initial Centroids")
print(C)
Initial Centroids
[[11. 34.]
[36. 47.]]
plt.scatter(f1, f2, c='#050505', s=7)
plt.scatter(C_x, C_y, marker='*', s=600, c='g')
[Scatter plot with the two initial centroids marked as green stars]
# Fit scikit-learn's K-means with two clusters on the full feature set
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=2)
kmeans.fit(X)
KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
n_clusters=2, n_init=10, n_jobs=None, precompute_distances='auto',
random_state=None, tol=0.0001, verbose=0)
from sklearn.metrics import classification_report
# Predict cluster IDs for the held-out rows. K-means labels are arbitrary,
# so this report is only meaningful if cluster 0 happens to line up with class 0.
y_pred = kmeans.predict(x_test)
print('K-Means : \n', classification_report(y_test, y_pred))
K-Means :
precision recall f1-score support
0 1.00 1.00 1.00 149
1 1.00 1.00 1.00 126
accuracy 1.00 275
macro avg 1.00 1.00 1.00 275
weighted avg 1.00 1.00 1.00 275
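A caveat on the report above: K-means cluster IDs are arbitrary (cluster 0 need not correspond to class 0), so comparing raw cluster IDs against y_test can flip between runs. A hedged sketch that maps each cluster to the majority true class among its test members before scoring (the majority-vote mapping is an assumption, not part of the original notebook):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score

km = KMeans(n_clusters=2, n_init=10).fit(X_train)
clusters = km.predict(x_test)
mapped = np.zeros_like(clusters)
for c in range(2):
    mask = clusters == c
    if mask.any():
        mapped[mask] = np.bincount(y_test[mask]).argmax()  # majority vote inside cluster c
print('K-Means accuracy after label alignment:', accuracy_score(y_test, mapped))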
4. Decision Tree Classifier
from sklearn import tree
Decision_Tree_Classifier = tree.DecisionTreeClassifier()
# Note: fitted on the full dataset (X, Y), so the test rows used below are seen during training
Decision_Tree_Classifier = Decision_Tree_Classifier.fit(X, Y)
tree.plot_tree(Decision_Tree_Classifier)
[Decision tree plot: root split on X[0] <= 0.32 (gini = 0.494, 1372 samples, value = [762, 610]); the tree grows until every leaf is pure (gini = 0.0)]
# Re-draw the tree and save the figure to disk
tree.plot_tree(Decision_Tree_Classifier)
plt.savefig('DTImage')
y_pred = Decision_Tree_Classifier.predict(x_test)
from sklearn.metrics import classification_report, confusion_matrix
print('Decision Tree : \n',classification_report(y_test,y_pred))
Decision Tree :
precision recall f1-score support
0 1.00 1.00 1.00 149
1 1.00 1.00 1.00 126
accuracy 1.00 275
macro avg 1.00 1.00 1.00 275
weighted avg 1.00 1.00 1.00 275
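Note that the tree above was fitted on the full X and Y, so x_test was already seen at training time and the perfect scores are optimistic. A hedged re-fit on the training split only, for a fairer comparison with SVM and KNN (random_state is an assumption, added for repeatability):

from sklearn import tree
from sklearn.metrics import classification_report

# Fit on the training rows only, then score on genuinely unseen rows
DT = tree.DecisionTreeClassifier(random_state=0)
DT.fit(X_train, Y_train)
print('Decision Tree (train-only fit):\n', classification_report(y_test, DT.predict(x_test)))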
COMPARISON ANALYSIS
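On this split, the reports above put the linear SVM at 0.99 accuracy (class 1 recall 0.97, i.e. a few class-1 notes missed out of 126) and KNN at 1.00. The decision tree's 1.00 should be read with the leakage caveat from the previous section, and the K-means figures with the label-alignment caveat: as an unsupervised method it never sees the class labels, so it is not directly comparable to the three classifiers. Overall, the banknote features separate the two classes almost perfectly, so all supervised models sit near the ceiling and differences of about a percentage point are within split-to-split noise. A hedged side-by-side re-run on the same split (model settings as used above; random_state is an assumption):

from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Fit each supervised model on the same training split and score on the
# same test split; K-means is omitted because its cluster IDs are not
# class predictions.
models = {
    'SVM (linear kernel)': SVC(kernel='linear'),
    'KNN (k = 1)': KNeighborsClassifier(n_neighbors=1),
    'Decision Tree': DecisionTreeClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(X_train, Y_train)
    print(name, ':', round(accuracy_score(y_test, model.predict(x_test)), 4))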