ML Programs for Internal Lab Exams
1. a) The probability that it is Friday and that a student is absent is 3%. Since there are 5
school days in a week, the probability that it is Friday is 20%. What is the probability
that a student is absent given that today is Friday?
Program:
pAF = 0.03   # P(Absent and Friday)
print("The probability that it is Friday and that a student is absent :", pAF)
pF = 0.2     # P(Friday)
print("The probability that it is Friday : ", pF)
pResult = pAF / pF   # P(Absent | Friday) = P(Absent and Friday) / P(Friday)
print("The probability that a student is absent given that today is Friday : ", pResult * 100, "%")
Output:
The probability that it is Friday and that a student is absent : 0.03
The probability that it is Friday : 0.2
The probability that a student is absent given that today is Friday : 15.0 %
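Worked check (conditional probability): P(Absent | Friday) = P(Absent and Friday) / P(Friday) = 0.03 / 0.20 = 0.15, i.e. 15%, which matches the program's output.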
b) Given the following data, which specify classifications for nine combinations of VAR1
and VAR2, predict a classification for a case where VAR1 = 0.906 and VAR2 = 0.606,
using the result of k-means clustering with 3 means (i.e., 3 centroids).
Program:
from sklearn.cluster import KMeans
import numpy as np
X = np.array([[1.713,1.586], [0.180,1.786], [0.353,1.240],
              [0.940,1.566], [1.486,0.759], [1.266,1.106],
              [1.540,0.419], [0.459,1.799], [0.773,0.186]])
y=np.array([0,1,1,0,1,0,1,1,1])
kmeans = KMeans(n_clusters=3, random_state=0).fit(X)  # k-means is unsupervised, so y is not passed to fit()
print("The input data is ")
print("VAR1 \t VAR2 \t CLASS")
i=0
for val in X:
    print(val[0], "\t", val[1], "\t", y[i])
    i += 1
print("="*20)
# To get test data from the user
print("The Test data to predict ")
test_data = []
VAR1 = float(input("Enter Value for VAR1 :"))
VAR2 = float(input("Enter Value for VAR2 :"))
test_data.append(VAR1)
test_data.append(VAR2)
print("="*20)
print("The predicted Class is : ",kmeans.predict([test_data]))
Output:
The input data is
VAR1 VAR2 CLASS
1.713 1.586 0
0.18 1.786 1
0.353 1.24 1
0.94 1.566 0
1.486 0.759 1
1.266 1.106 0
1.54 0.419 1
0.459 1.799 1
0.773 0.186 1
====================
The Test data to predict
Enter Value for VAR1 :0.906
Enter Value for VAR2 :0.606
====================
The predicted Class is : [0]
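Note: KMeans ignores any labels, so the value printed by predict() is a cluster index (0, 1 or 2), not the CLASS column. A minimal sketch for inspecting the fitted centroids and cluster assignments, assuming the same nine training points as above:
from sklearn.cluster import KMeans
import numpy as np

X = np.array([[1.713,1.586], [0.180,1.786], [0.353,1.240],
              [0.940,1.566], [1.486,0.759], [1.266,1.106],
              [1.540,0.419], [0.459,1.799], [0.773,0.186]])
# k-means is unsupervised, so only X is needed for fitting
kmeans = KMeans(n_clusters=3, random_state=0, n_init=10).fit(X)
print("Centroids :\n", kmeans.cluster_centers_)               # coordinates of the 3 means
print("Cluster index of each training point :", kmeans.labels_)
print("Cluster of [0.906, 0.606] :", kmeans.predict([[0.906, 0.606]]))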
2. Extract data from a database using Python.
**Write steps, program & output in the exam**
Steps to execute the 2nd program:
a) Install MySQL Workbench.
b) First, create a table (students) in the MySQL database (studentdb) and insert data into
the table.
---CREATE SCHEMA `studentdb`;
---CREATE TABLE `studentdb`.`students` (`sid` INT NOT NULL,`sname` VARCHAR(45) NOT
NULL,`age` INT NOT NULL,PRIMARY KEY (`sid`));
---INSERT INTO `studentdb`.`students` (`sid`, `sname`, `age`) VALUES ('01', 'abc', '20');
---INSERT INTO `studentdb`.`students` (`sid`, `sname`, `age`) VALUES ('02', 'xyz', '21');
---INSERT INTO `studentdb`.`students` (`sid`, `sname`, `age`) VALUES ('03', 'lmn', '19');
c) Install Python.
d) Next, open a command prompt and execute the following command to install the driver used to
connect to the MySQL database from Python:
--> pip install pymysql (Windows)
e) Open VS Code or any code editor, save the file as dbconnect.py, and run the program.
Program: dbconnect.py
import pymysql
connection = pymysql.connect( host='localhost', user='root', password='admin',
database='studentdb' )
cursor = connection.cursor()
cursor.execute('SELECT * FROM students;')
rows = cursor.fetchall()
for row in rows:
    print(row)
cursor.close()
connection.close()
Output:
c:/Users/lenovo/Desktop/dbconnect.py
(1, 'abc', 20)
(2, 'xyz', 21)
(3, 'lmn', 19)
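A minimal variant of dbconnect.py that uses a cursor context manager and a parameterized query (the same studentdb database and credentials as above are assumed; the age filter is only an illustration):
import pymysql

connection = pymysql.connect(host='localhost', user='root',
                             password='admin', database='studentdb')
try:
    # the cursor is closed automatically when the with-block ends
    with connection.cursor() as cursor:
        # %s placeholders let pymysql escape the value safely
        cursor.execute('SELECT sid, sname, age FROM students WHERE age >= %s', (20,))
        for row in cursor.fetchall():
            print(row)
finally:
    connection.close()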
3. a) Implement k-nearest neighbors classification using Python.
Program:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
irisData = load_iris()
X = irisData.data
y = irisData.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
knn = KNeighborsClassifier(n_neighbors=7)
knn.fit(X_train, y_train)
print(knn.predict(X_test))
Output:
[1 0 2 1 1 0 1 2 2 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0]
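Since y_test is held out by train_test_split, the classifier can also be scored on it; a short sketch with the same split and n_neighbors=7 as above:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

irisData = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    irisData.data, irisData.target, test_size=0.2, random_state=42)
knn = KNeighborsClassifier(n_neighbors=7)
knn.fit(X_train, y_train)
# fraction of the 30 test samples classified correctly
print("Test accuracy :", accuracy_score(y_test, knn.predict(X_test)))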
b) Find the unconditional probability of 'golf' and the conditional probability of 'single'
given 'medRisk' in the dataset.
Program:
total_Records=10
numGolfRecords=4
unConditionalprobGolf=numGolfRecords / total_Records
print("Unconditional probability of golf: ={}".format(unConditionalprobGolf))
#conditional probability of 'single' given 'medRisk'
numMedRiskSingle=2
numMedRisk=3
probMedRiskSingle=numMedRiskSingle/total_Records
probMedRisk=numMedRisk/total_Records
conditionalProb=(probMedRiskSingle/probMedRisk)
print("Conditional probability of single given medRisk: = {}".format(conditionalProb))
Output:
Unconditional probability of golf: =0.4
Conditional probability of single given medRisk: = 0.6666666666666667
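Worked check: P(golf) = 4/10 = 0.4, and P(single | medRisk) = P(single and medRisk) / P(medRisk) = (2/10) / (3/10) = 2/3 ≈ 0.667, matching the printed values.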
4. Implement linear regression using Python.
Program:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# To read data from Age_Income.csv file
dataFrame = pd.read_csv('/Age_Income.csv')
# To place data into the age and income vectors
age = dataFrame['Age']
income = dataFrame['Income']
# number of points
num = np.size(age)
# To find the mean of age and income vector
mean_age = np.mean(age)
mean_income = np.mean(income)
# calculating cross-deviation and deviation about age
CD_ageincome = np.sum(income*age) - num*mean_income*mean_age
CD_ageage = np.sum(age*age) - num*mean_age*mean_age
# calculating regression coefficients
b1 = CD_ageincome / CD_ageage
b0 = mean_income - b1*mean_age
# to display coefficients
print("Estimated Coefficients :")
print("b0 = ",b0,"\nb1 = ",b1)
# To plot the actual points as scatter plot
plt.scatter(age, income, color = "b",marker = "o")
# To predict the response vector
response_Vec = b0 + b1*age
# To plot the regression line
plt.plot(age, response_Vec, color = "r")
# Placing labels
plt.xlabel('Age')
plt.ylabel('Income')
# To display plot
plt.show()
Output:
Estimated Coefficients :
b0 = -14560.45016077166
b1 = 1550.7923748277433
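The coefficients can be cross-checked with scikit-learn's LinearRegression; a minimal sketch, assuming the same Age_Income.csv file with Age and Income columns:
import pandas as pd
from sklearn.linear_model import LinearRegression

dataFrame = pd.read_csv('/Age_Income.csv')
X = dataFrame[['Age']]          # 2-D feature matrix expected by scikit-learn
y = dataFrame['Income']
model = LinearRegression().fit(X, y)
# intercept_ corresponds to b0 and coef_[0] to b1 in the program above
print("b0 = ", model.intercept_, "\nb1 = ", model.coef_[0])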
5. Implement the Naïve Bayes classifier to classify English text
Program:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
# Naive Bayes Module
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
msglbl_data = pd.read_csv('/Statements_data.csv', names=['Message', 'Label'])
print("The Total instances in the Dataset: ", msglbl_data.shape[0])
msglbl_data['labelnum'] = msglbl_data.Label.map({'pos': 1, 'neg': 0})
# place the data in X and Y Vectors
X = msglbl_data["Message"]
Y = msglbl_data.labelnum
# to split the data into train set and test set
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y)
count_vect = CountVectorizer()
Xtrain_dims = count_vect.fit_transform(Xtrain)
Xtest_dims = count_vect.transform(Xtest)
df = pd.DataFrame(Xtrain_dims.toarray(),columns=count_vect.get_feature_names_out())
clf = MultinomialNB()
# to fit the train data into model
clf.fit(Xtrain_dims, Ytrain)
# to predict the test data
prediction = clf.predict(Xtest_dims)
print('******** Accuracy Metrics *********')
print('Accuracy : ', accuracy_score(Ytest, prediction))
print('Recall : ', recall_score(Ytest, prediction))
print('Precision : ',precision_score(Ytest, prediction))
print('Confusion Matrix : \n', confusion_matrix(Ytest, prediction))
print(10*"-")
# to predict the input statement
test_stmt = [input("Enter any statement to predict :")]
test_dims = count_vect.transform(test_stmt)
pred = clf.predict(test_dims)
for stmt, lbl in zip(test_stmt, pred):
    if lbl == 1:
        print("Statement is Positive")
    else:
        print("Statement is Negative")
Output:
The Total instances in the Dataset: 18
******** Accuracy Metrics *********
Accuracy : 0.2
Recall : 0.5
Precision : 0.25
Confusion Matrix :
[[0 3]
[1 1]]
----------
Enter any statement to predict : I Love Goa
Statement is Positive
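To see what CountVectorizer feeds into MultinomialNB, here is a toy sketch with two made-up sentences (illustrative only, not taken from Statements_data.csv):
from sklearn.feature_extraction.text import CountVectorizer

# hypothetical sentences, used only to show the bag-of-words encoding
docs = ["this place is great", "this food is not great"]
count_vect = CountVectorizer()
counts = count_vect.fit_transform(docs)
print(count_vect.get_feature_names_out())   # learned vocabulary
print(counts.toarray())                     # word counts per sentence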
6. Implement a finite-words classification system using the back-propagation algorithm
Program:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
#Neural Network Module
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score
msglbl_data = pd.read_csv('/Statements_data.csv', names=['Message', 'Label'])
print("The Total instances in the Dataset: ", msglbl_data.shape[0])
msglbl_data['labelnum'] = msglbl_data.Label.map({'pos': 1, 'neg': 0})
# place the data in X and Y Vectors
X = msglbl_data["Message"]
Y = msglbl_data.labelnum
# to split the data into train set and test set
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, Y)
count_vect = CountVectorizer()
Xtrain_dims = count_vect.fit_transform(Xtrain)
Xtest_dims = count_vect.transform(Xtest)
df = pd.DataFrame(Xtrain_dims.toarray(),columns=count_vect.get_feature_names_out())
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,hidden_layer_sizes=(5, 2), random_state=1)
# to fit the train data into model
clf.fit(Xtrain_dims, Ytrain)
# to predict the test data
prediction = clf.predict(Xtest_dims)
print('******** Accuracy Metrics *********')
print('Accuracy : ', accuracy_score(Ytest, prediction))
print('Recall : ', recall_score(Ytest, prediction))
print('Precision : ',precision_score(Ytest, prediction))
print('Confusion Matrix : \n', confusion_matrix(Ytest, prediction))
print(10*"-")
# to predict the input statement
test_stmt = [input("Enter any statement to predict :")]
test_dims = count_vect.transform(test_stmt)
pred = clf.predict(test_dims)
for stmt, lbl in zip(test_stmt, pred):
    if lbl == 1:
        print("Statement is Positive")
    else:
        print("Statement is Negative")
Output:
The Total instances in the Dataset: 18
******** Accuracy Metrics *********
Accuracy : 0.4
Recall : 0.6666666666666666
Precision : 0.5
Confusion Matrix :
[[0 2]
[1 2]]
----------
Enter any statement to predict : i do not like winter
Statement is Negative
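hidden_layer_sizes=(5, 2) gives the network two hidden layers with 5 and 2 neurons. A minimal sketch of the same MLPClassifier settings trained on a tiny made-up XOR-style dataset (illustrative only):
import numpy as np
from sklearn.neural_network import MLPClassifier

# hypothetical 2-D inputs with XOR-style labels, used only for illustration
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])
clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                    hidden_layer_sizes=(5, 2), random_state=1)
clf.fit(X, y)           # gradients are computed by back-propagation
print(clf.predict(X))   # predictions for the four training points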