AD3461 Machine Learning Lab Manual

Machine Learning Lab (SRM Institute of Science and Technology)


Ex No: 1 IMPLEMENTATION OF CANDIDATE ELIMINATION ALGORITHM

AIM:

To implement the Candidate Elimination algorithm using a Python script.

ALGORITHM:

Step 1: Initialize the version space.

● Initialize the most general hypothesis (h_G) to the maximally general hypothesis
(all attributes set to '?').

● Initialize the most specific hypothesis (h_S) to the maximally specific hypothesis
(all attributes set to specific values or 'null' if not possible).

Step 2: Iterate through the training examples.

● For each positive example, update the boundaries as follows:

● Remove from h_G any hypothesis that is inconsistent with the
example.

● For each attribute of h_S that does not match the example, generalize
it (replace the value with '?') so that h_S covers the example.

● For each negative example, update the boundaries as follows:

● Remove from h_S any hypothesis that covers the example.

● Minimally specialize the hypotheses in h_G so that they exclude the
example while remaining more general than h_S.
Step 3: Refine the version space.


● Remove from h_G any hypothesis that is less general than another member of
h_G, and remove from h_S any hypothesis that is more general than another
member of h_S.

Step 4: Repeat Steps 2 and 3 until convergence.

● Keep iterating through the training examples and refining the boundaries
until the version space is consistent with all the training examples, i.e.,
every hypothesis between h_S and h_G classifies them correctly.

Step 5: Output the final hypothesis.
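
To make Step 2 concrete, the following is a minimal sketch (separate from the
full program below) of how a single positive example generalizes the specific
hypothesis; the attribute values are illustrative only:

# Illustrative values, not taken from finds1.csv
specific_h = ['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same']
positive_example = ['Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same']

for x in range(len(specific_h)):
    if positive_example[x] != specific_h[x]:
        specific_h[x] = '?'  # generalize the mismatching attribute

print(specific_h)  # ['Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same']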

PROGRAM:

import numpy as np
import pandas as pd

# Load the training data: attributes first, target concept in the last column
data = pd.read_csv('finds1.csv')
concepts = np.array(data.iloc[:, 0:-1])
target = np.array(data.iloc[:, -1])

def learn(concepts, target):
    # Initialize the specific boundary from the first training example
    specific_h = concepts[0].copy()
    print("initialization of specific_h and general_h")
    print(specific_h)

    # Initialize the general boundary to the most general hypotheses
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print(general_h)

    for i, h in enumerate(concepts):
        if target[i] == "Yes":
            # Positive example: generalize specific_h where it disagrees
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        if target[i] == "No":
            # Negative example: specialize general_h where it disagrees
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print("steps of Candidate Elimination Algorithm", i + 1)
        print("Specific_h", i + 1, "\n")
        print(specific_h)
        print("general_h", i + 1, "\n")
        print(general_h)

    # Drop the rows of general_h that remained fully general
    indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
    for i in indices:
        general_h.remove(['?', '?', '?', '?', '?', '?'])
    return specific_h, general_h

s_final, g_final = learn(concepts, target)
print("Final Specific_h:", s_final, sep="\n")
print("Final General_h:", g_final, sep="\n")


OUTPUT:

initialization of specific_h and general_h

['Cloudy' 'Cold' 'High' 'Strong' 'Warm' 'Change']

[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

steps of Candidate Elimination Algorithm 8

Specific_h 8

['?' '?' '?' 'Strong' '?' '?']

general_h 8

[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', 'Strong', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]

Final Specific_h:

['?' '?' '?' 'Strong' '?' '?']

Final General_h:

[['?', '?', '?', 'Strong', '?', '?']]


RESULT:

Thus the Candidate Elimination algorithm has been implemented successfully.

Ex.No: 2 IMPLEMENTATION OF DECISION TREE BASED ID3 ALGORITHM

AIM:

To implement the Decision Tree based ID3 algorithm using a Python script.

ALGORITHM:

Step 1: Start the program

Step 2: Load the dataset and organize it into a table, with rows representing instances and
columns representing features. The last column should contain the class labels.

Step 3: Define a function to calculate the entropy of the dataset. Entropy measures the
uncertainty in the dataset based on class distribution.


Step 4: For each feature, calculate the information gain. Information gain measures how
much a feature contributes to reducing the uncertainty in the dataset.

Step 5: Select the feature with the highest information gain as the best feature to split the
dataset.

Step 6: Divide the dataset into subsets based on the values of the best feature found in
Step 5.

Step 7: Repeat Steps 3 to 6 recursively for each subset until all instances are classified
or no features remain.

Step 8: Build the decision tree by assigning the best feature as the splitting criterion at
each internal node and the majority class as the class label for each leaf node.

Step 9: Use the created decision tree to classify new instances by traversing the tree from
the root to the appropriate leaf node based on their feature values.

Step 10: Evaluate the Model

Step 11: Stop the program
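
As a quick check of Steps 3 and 4: for the standard play-tennis target with 9 'Yes' and
5 'No' examples (an assumption about the data set used below), the entropy works out
to about 0.940 bits:

import numpy as np

counts = np.array([9, 5])       # 9 positive, 5 negative examples
p = counts / counts.sum()       # class probabilities
print(-np.sum(p * np.log2(p)))  # ≈ 0.940 bits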

PROGRAM:

import pandas as pd
import numpy as np

dataset = pd.read_csv('playtennis.csv',
                      names=['outlook', 'temperature', 'humidity', 'wind', 'class'])

def entropy(target_col):
    elements, counts = np.unique(target_col, return_counts=True)
    # Entropy = -sum(p * log2(p)) over the class distribution
    entropy = np.sum([(-counts[i] / np.sum(counts)) * np.log2(counts[i] / np.sum(counts))
                      for i in range(len(elements))])
    return entropy

def InfoGain(data, split_attribute_name, target_name="class"):
    total_entropy = entropy(data[target_name])
    vals, counts = np.unique(data[split_attribute_name], return_counts=True)
    # Weighted entropy of the subsets produced by the split
    Weighted_Entropy = np.sum([(counts[i] / np.sum(counts)) *
                               entropy(data.where(data[split_attribute_name] == vals[i]).dropna()[target_name])
                               for i in range(len(vals))])
    Information_Gain = total_entropy - Weighted_Entropy
    return Information_Gain

def ID3(data, originaldata, features, target_attribute_name="class", parent_node_class=None):
    # All remaining examples share one class: return that class
    if len(np.unique(data[target_attribute_name])) <= 1:
        return np.unique(data[target_attribute_name])[0]
    # No examples left: return the majority class of the original dataset
    elif len(data) == 0:
        return np.unique(originaldata[target_attribute_name])[
            np.argmax(np.unique(originaldata[target_attribute_name], return_counts=True)[1])]
    # No features left: return the parent node's majority class
    elif len(features) == 0:
        return parent_node_class
    else:
        parent_node_class = np.unique(data[target_attribute_name])[
            np.argmax(np.unique(data[target_attribute_name], return_counts=True)[1])]
        # Information gain values for the features in the dataset
        item_values = [InfoGain(data, feature, target_attribute_name) for feature in features]
        best_feature_index = np.argmax(item_values)
        best_feature = features[best_feature_index]
        tree = {best_feature: {}}
        features = [i for i in features if i != best_feature]
        for value in np.unique(data[best_feature]):
            sub_data = data.where(data[best_feature] == value).dropna()
            subtree = ID3(sub_data, dataset, features, target_attribute_name, parent_node_class)
            tree[best_feature][value] = subtree
        return tree

tree = ID3(dataset, dataset, dataset.columns[:-1])
print('\nDisplay Tree\n', tree)

OUTPUT:

Display Tree

{'outlook': {'Overcast': 'Yes', 'Rain': {'wind': {'Strong': 'No', 'Weak': 'Yes'}}, 'Sunny': {'humidity': {'High': 'No', 'Normal': 'Yes'}}}}


RESULT:

Thus the Decision Tree based ID3 algorithm has been implemented successfully.

EX NO.3 IMPLEMENTATION OF ARTIFICIAL NEURAL NETWORK USING BACK


PROPAGATION ALGORITHM

AIM:

To implement an Artificial Neural Network using the Back Propagation algorithm with a
Python script.

ALGORITHM:

Step 1: Inputs X arrive through the preconnected path.

Step 2: The input is modeled using weights W, which are usually chosen randomly.

Step 3: Calculate the output of each neuron from the input layer to the hidden layer to the
output layer.


Step 4: Calculate the error in the outputs

Step 5: From the output layer, go back to the hidden layer to adjust the weights to reduce
the error.

Step 6: Repeat the process until the desired output is achieved.
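
A note on the error computation in Steps 4 and 5: the derivative of sigmoid(z) is
sigmoid(z) * (1 - sigmoid(z)). The program below stores the activations rather than the
raw inputs, so its derivative helper takes the already-activated value; a quick check of
that convention:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

a = sigmoid(0.0)    # activation at z = 0 is 0.5
print(a * (1 - a))  # 0.25, the slope of the sigmoid at z = 0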

PROGRAM:

import numpy as np

X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X / np.amax(X, axis=0)  # normalise each feature by its column maximum
y = y / 100                 # scale targets into [0, 1]

# Sigmoid Function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of Sigmoid Function (x is the sigmoid output, not the raw input)
def derivatives_sigmoid(x):
    return x * (1 - x)

# Variable initialization
epoch = 7000              # number of training iterations
lr = 0.1                  # learning rate
inputlayer_neurons = 2    # number of features in the data set
hiddenlayer_neurons = 3   # number of hidden layer neurons
output_neurons = 1        # number of neurons at the output layer

# Weight and bias initialization: uniform random values of dimension x*y
wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # Forward propagation
    hinp = np.dot(X, wh) + bh
    hlayer_act = sigmoid(hinp)
    outinp = np.dot(hlayer_act, wout) + bout
    output = sigmoid(outinp)

    # Backpropagation
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    # how much the hidden layer weights contributed to the error
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hiddenlayer = EH * hiddengrad

    # Weight and bias updates (dot product of next-layer error and current-layer output)
    wout += hlayer_act.T.dot(d_output) * lr
    bout += np.sum(d_output, axis=0, keepdims=True) * lr
    wh += X.T.dot(d_hiddenlayer) * lr
    bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)

OUTPUT:

Input:

[[ 0.66666667 1. ]

[ 0.33333333 0.55555556]

[ 1. 0.66666667]]

Actual Output:

[[ 0.92]

[ 0.86]

[ 0.89]]

Predicted Output:


[[ 0.89559591]

[ 0.88142069]

[ 0.8928407 ]]

RESULT:
Thus the Back Propagation algorithm has been implemented successfully.

EX.NO 4: IMPLEMENTATION OF NAIVE BAYESIAN CLASSIFIER

AIM:

To implement the Naïve Bayesian Classifier using a Python script.

ALGORITHM:

Step 1: Pre-process the data.

Step 2: Fit the Naive Bayes model to the training set.

Step 3: Predict the test results.

Step 4: Test the accuracy of the result (creation of the confusion matrix).

Downloaded by Sharmi Rks (sharmirks74@gmail.com)


lOMoARcPSD|47101843

15

Step 5: Visualize the test set results.
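
Before the program, a quick sketch of the bag-of-words step it relies on:
CountVectorizer turns each message into a row of token counts. The two sample sentences
are illustrative, not taken from naivetext1.csv, and get_feature_names_out assumes a
recent scikit-learn version:

from sklearn.feature_extraction.text import CountVectorizer

# Two illustrative documents (not from naivetext1.csv)
docs = ["I love this sandwich", "I do not like this sandwich"]
vect = CountVectorizer()
dtm = vect.fit_transform(docs)

print(vect.get_feature_names_out())  # vocabulary learned from the documents
print(dtm.toarray())                 # one row of token counts per document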

PROGRAM:

import pandas as pd

msg=pd.read_csv('naivetext1.csv',names=['message','label'])

print('The dimensions of the dataset',msg.shape)

msg['labelnum']=msg.label.map({'pos':1,'neg':0})

X=msg.message

y=msg.labelnum

print(X)

print(y)

from sklearn.model_selection import train_test_split

xtrain,xtest,ytrain,ytest=train_test_split(X,y)

print(xtest.shape)

print(xtrain.shape)

print(ytest.shape)

print(ytrain.shape)

from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer()

xtrain_dtm = count_vect.fit_transform(xtrain)

xtest_dtm=count_vect.transform(xtest)


from sklearn.naive_bayes import MultinomialNB

clf = MultinomialNB().fit(xtrain_dtm,ytrain)

predicted = clf.predict(xtest_dtm)

from sklearn import metrics

print('Accuracy metrics')

print('Accuracy of the classifier is',metrics.accuracy_score(ytest,predicted))

print('Confusion matrix')

print(metrics.confusion_matrix(ytest,predicted))

print('Recall and Precision')

print(metrics.recall_score(ytest,predicted))

print(metrics.precision_score(ytest,predicted))

OUTPUT:

The dimensions of the dataset (18, 2)

0 I love this sandwich

1 This is an amazing place

2 I feel very good about these beers

3 This is my best work

4 What an awesome view

5 I do not like this restaurant

6 I am tired of this stuff

7 I can't deal with this


8 He is my sworn enemy

9 My boss is horrible

10 This is an awesome place

11 I do not like the taste of this juice

12 I love to dance

13 I am sick and tired of this place

14 What a great holiday

15 That is a bad locality to stay

16 We will have good fun tomorrow

17 I went to my enemy's house today

Name: message, dtype: object

0 1

1 1

2 1

3 1

4 1

5 0

6 0

7 0

8 0

9 0

10 1


11 0

12 1

13 0

14 1

15 0

16 1

17 0

Name: labelnum, dtype: int64

(5,)

(13,)

(5,)

(13,)

Accuracy metrics

Accuracy of the classifier is 0.8

Confusion matrix

[[3 1]

[0 1]]

Recall and Precision

1.0

0.5


RESULT:
Thus the Naive Bayesian Classifier algorithm has been implemented successfully.

EX NO 5: IMPLEMENTATION OF NAIVE BAYESIAN CLASSIFIER MODEL TO


CLASSIFY A SET OF DOCUMENTS


AIM:
To implement the Naïve Bayesian Classifier model to classify a set of documents using
Python.
ALGORITHM:

Step 1: Load the CSV data set and convert every attribute value to a number.

Step 2: Split the data set into training and test sets (about 67% for training).

Step 3: Separate the training instances by class and summarize each attribute of each
class by its mean and standard deviation.

Step 4: For every test instance, compute the probability of each class using the
Gaussian density of each attribute and predict the most probable class.

Step 5: Measure the accuracy of the predictions on the test set.
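
Step 4 relies on the Gaussian (normal) density, the same formula implemented as
calculateProbability in the program below. As a sanity check, the density at the mean
with a standard deviation of 1 is 1/sqrt(2*pi) ≈ 0.3989:

import math

def calculateProbability(x, mean, stdev):
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

print(calculateProbability(0.0, 0.0, 1.0))  # ≈ 0.3989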

PROGRAM:
import csv
import random
import math

def loadCsv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    for i in range(len(dataset)):
        # convert strings into numbers for processing
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitDataset(dataset, splitRatio):
    # 67% training size
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    copy = list(dataset)
    while len(trainSet) < trainSize:
        # pick random indices from the dataset to select training elements
        index = random.randrange(len(copy))
        trainSet.append(copy.pop(index))
    return [trainSet, copy]

def separateByClass(dataset):
    # create a dictionary keyed by class (1 and 0) whose values are the
    # instances belonging to each class
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if vector[-1] not in separated:
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers) / float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)

def summarize(dataset):
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]  # drop the summary of the class column
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    # summaries is a dict of (mean, std) tuples for each class value
    summaries = {}
    for classValue, instances in separated.items():
        summaries[classValue] = summarize(instances)
    return summaries

def calculateProbability(x, mean, stdev):
    exponent = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

def calculateClassProbabilities(summaries, inputVector):
    # class and attribute information as mean and standard deviation
    probabilities = {}
    for classValue, classSummaries in summaries.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            # mean and sd of every attribute for classes 0 and 1 separately
            mean, stdev = classSummaries[i]
            x = inputVector[i]  # the test vector's i-th attribute
            # multiply in the normal density of this attribute
            probabilities[classValue] *= calculateProbability(x, mean, stdev)
    return probabilities

def predict(summaries, inputVector):
    probabilities = calculateClassProbabilities(summaries, inputVector)
    # assign the class that has the highest probability
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def getPredictions(summaries, testSet):
    predictions = []
    for i in range(len(testSet)):
        result = predict(summaries, testSet[i])
        predictions.append(result)
    return predictions

def getAccuracy(testSet, predictions):
    correct = 0
    for i in range(len(testSet)):
        if testSet[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(testSet))) * 100.0

def main():
    filename = '5data.csv'
    splitRatio = 0.67
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows'.format(
        len(dataset), len(trainingSet), len(testSet)))
    # prepare model
    summaries = summarizeByClass(trainingSet)
    # test model
    predictions = getPredictions(summaries, testSet)
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy of the classifier is : {0}%'.format(accuracy))

main()

OUTPUT:

confusion matrix is as follows
[[17  0  0]
 [ 0 17  0]
 [ 0  0 11]]

Accuracy metrics

             precision    recall  f1-score   support

          0       1.00      1.00      1.00        17
          1       1.00      1.00      1.00        17
          2       1.00      1.00      1.00        11

avg / total       1.00      1.00      1.00        45

RESULT:
Thus the implementation of Naïve Bayesian Classifier model has been done successfully.


EX NO 6: CONSTRUCTING A BAYESIAN NETWORK TO DIAGNOSE AN


INFECTION USING WHO DATA SET.

AIM:
To implement a Bayesian Network to diagnose an infection with the WHO data set using a
Python script.
ALGORITHM:
Step 1: Identify the variables (attributes) specified in the data set.

Step 2: Determine the domain of values each variable can take.

Step 3: Create a directed graph in which each node represents an attribute and each
edge represents a parent-child dependency.

Step 4: Determine the prior and conditional probabilities for each attribute from the
data.

Step 5: Perform inference on the model and determine the marginal probability of the
infection variable given the observed evidence.

PROGRAM:
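
The manual leaves this program blank. The following is a minimal sketch of one way such
a network could be built with the pgmpy library; the file name who_data.csv, the column
names (fever, cough, age_group, infection) and the network structure are all
illustrative assumptions, not the official solution:

import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Hypothetical WHO-style records with discrete columns (assumed names)
data = pd.read_csv('who_data.csv')

# Assumed structure: symptoms and age group directly influence infection
model = BayesianNetwork([('fever', 'infection'),
                         ('cough', 'infection'),
                         ('age_group', 'infection')])

# Learn the conditional probability tables from the data
model.fit(data, estimator=MaximumLikelihoodEstimator)

# Query the probability of infection given the observed symptoms
infer = VariableElimination(model)
result = infer.query(variables=['infection'], evidence={'fever': 1, 'cough': 1})
print(result)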



OUTPUT:


RESULT:
Thus the implementation of a Bayesian Network to diagnose an infection with the WHO
data set has been done successfully.


EX NO: 7 IMPLEMENTATION OF EM ALGORITHM TO CLUSTER A SET OF


DATA

AIM:
To implement the EM algorithm to cluster a data set using Python.

ALGORITHM:
Step 1: Load the data set and select the features to be clustered.

Step 2: Choose the number of clusters and initialize the mean, covariance and mixing
weight of each Gaussian component.

Step 3: E-step: for every data point, compute the responsibility (posterior
probability) of each component.

Step 4: M-step: re-estimate each component's mean, covariance and mixing weight from
the responsibilities.

Step 5: Repeat Steps 3 and 4 until the log-likelihood converges, then assign each point
to its most probable component.

PROGRAM:
import numpy as np

from sklearn.cluster import KMeans

import matplotlib.pyplot as plt

from sklearn.mixture import GaussianMixture

import pandas as pd

X=pd.read_csv("kmeansdata.csv")

x1 = X['Distance_Feature'].values

x2 = X['Speeding_Feature'].values

X = np.array(list(zip(x1, x2))).reshape(len(x1), 2)

plt.plot()


plt.xlim([0, 100])

plt.ylim([0, 50])

plt.title('Dataset')

plt.scatter(x1, x2)

plt.show()

#code for EM

gmm = GaussianMixture(n_components=3)

gmm.fit(X)

em_predictions = gmm.predict(X)

print("\nEM predictions")

print(em_predictions)

print("mean:\n",gmm.means_)

print('\n')

print("Covariances\n",gmm.covariances_)

print(X)

plt.title('Expectation Maximization')

plt.scatter(X[:,0], X[:,1],c=em_predictions,s=50)

plt.show()

#code for Kmeans

import matplotlib.pyplot as plt1

kmeans = KMeans(n_clusters=3)


kmeans.fit(X)

print(kmeans.cluster_centers_)

print(kmeans.labels_)

plt.title('KMEANS')

plt1.scatter(X[:,0], X[:,1], c=kmeans.labels_, cmap='rainbow')

plt1.scatter(kmeans.cluster_centers_[:,0], kmeans.cluster_centers_[:,1], color='black')

plt1.show()

OUTPUT:

EM predictions
[0 0 0 1 0 1 1 1 2 1 2 2 1 1 2 1 2 1 0 1 0 1 1]

mean:
[[57.70629058 25.73574491]
 [52.12044022 22.46250453]
 [46.4364858  39.43288647]]

Covariances
[[[83.51878796 14.926902  ]
  [14.926902    2.70846907]]
 [[29.95910352 15.83416554]
  [15.83416554 67.01175729]]
 [[79.34811849 29.55835938]
  [29.55835938 18.17157304]]]

[[71.24 28.  ] [52.53 25.  ] [64.54 27.  ] [55.69 22.  ] [54.58 25.  ]
 [41.91 10.  ] [58.64 20.  ] [52.02  8.  ] [31.25 34.  ] [44.31 19.  ]
 [49.35 40.  ] [58.07 45.  ] [44.22 22.  ] [55.73 19.  ] [46.63 43.  ]
 [52.97 32.  ] [46.25 35.  ] [51.55 27.  ] [57.05 26.  ] [58.45 30.  ]
 [43.42 23.  ] [55.68 37.  ] [55.15 18.  ]]

[[57.74090909 24.27272727]
 [48.6        38.        ]
 [45.176      16.4       ]]

[0 0 0 0 0 2 0 2 1 2 1 1 2 0 1 1 1 0 0 0 2 1 0]


RESULT:
Thus the EM Algorithm to cluster a data set has been implemented successfully.


EX NO 8: IMPLEMENTATION OF K-NEAREST NEIGHBOUR


ALGORITHM TO CLASSIFY IRIS DATASET
AIM:
To implement the K-Nearest Neighbour algorithm to classify the Iris data set using Python.

ALGORITHM:
Step 1: Start the Program

Step 2: Import the required modules.

Step 3: Load the data set; scikit-learn provides the Iris data set through load_iris
(it also offers tools for creating synthetic data sets).

Step 4: Visualize the dataset.

Step 5: Split the data into training and testing sets.

Step 6: Build a KNN classifier object for the implementation.

Step 7: Make predictions with the KNN classifier on the test set and compare the
forecast target values to the actual values.

Step 8: Compute the prediction accuracy (optionally for more than one value of k).

Step 9: Visualize Predictions

Step 10: Stop the Program.
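
Step 8 mentions checking the accuracy for more than one value of k; the program below
uses k = 3 only, so here is a small sketch of how two values of k could be compared on
one split (the choices k = 3 and k = 5 are illustrative):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

iris = load_iris()
Xtrain, Xtest, ytrain, ytest = train_test_split(iris['data'], iris['target'],
                                                test_size=0.10, random_state=42)
for k in (3, 5):  # illustrative values of k
    clf = KNeighborsClassifier(n_neighbors=k).fit(Xtrain, ytrain)
    acc = metrics.accuracy_score(ytest, clf.predict(Xtest))
    print('k=%d accuracy=%.2f' % (k, acc))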

PROGRAM:
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split


from sklearn import metrics


from sklearn.datasets import load_iris
iris=load_iris()
iris.keys()
df=pd.DataFrame(iris['data'])
X=df
y=iris['target']
print(X.head())
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.10)

classifier = KNeighborsClassifier(n_neighbors=3).fit(Xtrain, ytrain)

ypred = classifier.predict(Xtest)

i = 0
print("\n-------------------------------------------------------------------------")
print('%-25s %-25s %-25s' % ('Original Label', 'Predicted Label', 'Correct/Wrong'))
print("-------------------------------------------------------------------------")
for label in ytest:
    print('%-25s %-25s' % (label, ypred[i]), end="")
    if label == ypred[i]:
        print(' %-25s' % ('Correct'))
    else:
        print(' %-25s' % ('Wrong'))
    i = i + 1
print("-------------------------------------------------------------------------")
print("\nConfusion Matrix:\n", metrics.confusion_matrix(ytest, ypred))
print("-------------------------------------------------------------------------")
print("\nClassification Report:\n", metrics.classification_report(ytest, ypred))
print("-------------------------------------------------------------------------")
print('Accuracy of the classifier is %0.2f' % metrics.accuracy_score(ytest, ypred))
print("-------------------------------------------------------------------------")


OUTPUT:
     0    1    2    3
0  5.1  3.5  1.4  0.2
1  4.9  3.0  1.4  0.2
2  4.7  3.2  1.3  0.2
3  4.6  3.1  1.5  0.2
4  5.0  3.6  1.4  0.2
-------------------------------------------------------------------------


Original Label            Predicted Label           Correct/Wrong

2                         2                         Correct
1                         1                         Correct
2                         2                         Correct
0                         0                         Correct
0                         0                         Correct
1                         1                         Correct
2                         2                         Correct
2                         2                         Correct
0                         0                         Correct
0                         0                         Correct
0                         0                         Correct
1                         1                         Correct
2                         2                         Correct
1                         1                         Correct
1                         1                         Correct

Confusion Matrix:
[[5 0 0]
 [0 5 0]
 [0 0 5]]
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         5
           1       1.00      1.00      1.00         5
           2       1.00      1.00      1.00         5

    accuracy                           1.00        15
   macro avg       1.00      1.00      1.00        15
weighted avg       1.00      1.00      1.00        15
-------------------------------------------------------------------------
Accuracy of the classifier is 1.00
-------------------------------------------------------------------------

RESULT:


Thus the K-Nearest Neighbour Algorithm to classify the data set using Python has been
implemented successfully.

EX NO 9: IMPLEMENTATION OF NON-PARAMETRIC
LOCALLY WEIGHTED REGRESSION ALGORITHM
AIM:

To implement the non-parametric Locally Weighted Regression algorithm using Python.

ALGORITHM:

Step 1: Load the data set (total_bill and tip) into arrays.

Step 2: Add a bias (intercept) column of ones to form the input matrix X.

Step 3: Choose the bandwidth parameter k of the Gaussian kernel.

Step 4: For a query point, weight every training point by
exp(-(x - xj)^2 / (2k^2)), so that nearby points dominate the fit.

Step 5: Solve the weighted normal equations W = (X^T wei X)^-1 (X^T wei y) to obtain
the local regression parameters.

Step 6: Predict the output of the query point as x * W and repeat Steps 4 and 5 for
every point to be predicted.

Step 7: Sort the predictions by the input value and plot the fitted curve over the
scatter of the data.
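
As a quick check of Step 4: the Gaussian kernel weight of a training point at distance
1 from the query, with bandwidth k = 0.5, is exp(-1 / (2 * 0.25)) = exp(-2) ≈ 0.135:

import numpy as np

diff = 1.0  # distance between the query point and a training point
k = 0.5     # kernel bandwidth
print(np.exp(-(diff ** 2) / (2.0 * k ** 2)))  # ≈ 0.135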

PROGRAM:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def kernel(point, xmat, k):
    m, n = np.shape(xmat)
    weights = np.mat(np.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        # Gaussian kernel: closer points receive larger weights
        weights[j, j] = np.exp(diff * diff.T / (-2.0 * k ** 2))
    return weights

def localWeight(point, xmat, ymat, k):
    wei = kernel(point, xmat, k)
    # Weighted normal equations: W = (X^T wei X)^-1 (X^T wei y)
    W = (xmat.T * (wei * xmat)).I * (xmat.T * (wei * ymat.T))
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i] * localWeight(xmat[i], xmat, ymat, k)
    return ypred

# load data points
data = pd.read_csv("/Users/HP/Downloads/10-dataset.csv")
bill = np.array(data.total_bill)
tip = np.array(data.tip)

# prepare the input matrix: add a column of ones (the intercept term) to bill
mbill = np.mat(bill)
mtip = np.mat(tip)
m = np.shape(mbill)[1]
one = np.mat(np.ones(m))
X = np.hstack((one.T, mbill.T))

# set the bandwidth k here
ypred = localWeightRegression(X, mtip, 0.5)
SortIndex = X[:, 1].argsort(0)
xsort = X[SortIndex][:, 0]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(bill, tip, color='green')
ax.plot(xsort[:, 1], ypred[SortIndex], color='red', linewidth=5)
plt.xlabel('Total bill')
plt.ylabel('Tip')
plt.show()

OUTPUT:

(A green scatter plot of total bill versus tip, with the fitted locally weighted
regression curve drawn in red.)

RESULT:

Thus the non-parametric Locally Weighted Regression algorithm has been implemented
successfully.

EX NO 10: IMPLEMENTATION OF REGRESSION


ALGORITHM
AIM:

To implement the Regression algorithm using a Python script.

ALGORITHM:


Step 1: Import the required libraries.

Step 2: Load the data set and separate the input feature(s) from the target variable.

Step 3: Split the data into training and test sets.

Step 4: Fit a regression model (for example, linear regression) to the training data.

Step 5: Predict the target values for the test set.

Step 6: Evaluate the model, for example with the mean squared error and the R-squared
score, and visualize the fitted line.

PROGRAM:
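
The manual leaves this program and its output blank. The following is a minimal sketch
of simple linear regression with scikit-learn; the synthetic data (y = 3x + 4 plus
noise) is illustrative, not the prescribed data set:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Illustrative synthetic data: y = 3x + 4 plus Gaussian noise
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + 4 + rng.normal(scale=2.0, size=100)

Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(Xtrain, ytrain)
ypred = model.predict(Xtest)

print('Coefficient:', model.coef_[0])   # should be close to 3
print('Intercept:', model.intercept_)   # should be close to 4
print('MSE:', mean_squared_error(ytest, ypred))
print('R2 score:', r2_score(ytest, ypred))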



RESULT:

Thus the Regression Algorithm using Python has been implemented successfully.


EX NO 11: IMPLEMENTATION OF FIND S ALGORITHM


AIM:

To implement the Find-S algorithm using a Python script.

ALGORITHM:

Step 1: Load the training examples; the last column is the target concept (Yes/No).

Step 2: Initialize the hypothesis h to the first positive training example.

Step 3: For every subsequent positive example, compare each attribute value with the
corresponding value in h.

Step 4: If the attribute value matches h, keep it; otherwise replace it with the
general symbol '?'.

Step 5: Ignore all negative examples.

Step 6: Output the final maximally specific hypothesis h.

PROGRAM:
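
The manual leaves this program blank. Here is a minimal sketch of Find-S; it assumes
the same finds1.csv used in Ex No: 1 (attribute columns followed by a Yes/No target
column):

import pandas as pd

# Assumes finds1.csv from Ex No: 1: attributes, then a Yes/No target column
data = pd.read_csv('finds1.csv')
concepts = data.iloc[:, 0:-1].values
target = data.iloc[:, -1].values

h = None
for attrs, label in zip(concepts, target):
    if label != "Yes":
        continue  # Find-S ignores negative examples
    if h is None:
        h = list(attrs)  # start from the first positive example
    else:
        # generalize every attribute that disagrees with the example
        h = [hv if hv == av else '?' for hv, av in zip(h, attrs)]

print("Final maximally specific hypothesis:", h)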


RESULT:

Thus the Find-S algorithm using Python has been implemented successfully.
