
MAHARAJA SURAJMAL INSTITUTE

Affiliated to GGSIP University & NAAC ‘A+’ grade accredited

DEPARTMENT OF COMPUTER APPLICATIONS

Machine Learning
PRACTICAL FILE
SUBJECT CODE – BCAP 311

Submitted by: Kanika Mittal
Enrollment no.: 00121202021
Sem: 5th, Sec: A (2nd shift)

Submitted to: Dr. Anamika Rana
Associate Professor, MSI
Sign: ____________
INDEX

S.No.  Practical                                                                                    Date      Sign
1.     Write a program in Python to implement Linear Regression with one variable.                  11/09/23
2.     Write a program in Python to implement Linear Regression with multiple variables.            11/09/23
3.     Write a program in Python to implement Logistic Regression.                                  25/09/23
4.     Write a program in Python to implement SVM Classifier.                                       25/10/23
5.     Write a program in Python to implement KNN Classifier.                                       09/10/23
6.     Write a program in Python to implement a Decision Tree Classifier.                           09/10/23
7.     Write a program in Python to implement the Naïve Bayes Classifier.                           16/10/23
8.     Write a program in Python to implement the Random Forest Classifier.                         16/10/23
9.     Build an Artificial Neural Network (ANN) by implementing the Back Propagation Algorithm.     21/10/23
10.    Write a program in Python to implement K-means Algorithm.                                    28/10/23
11.    Write a program in Python on Self Organising Map (SOM).                                      18/11/23
12.    Write a program in Python for Empirical Comparison of different Supervised learning techniques.    23/11/23
13.    Write a program in Python for Empirical Comparison of different Unsupervised learning techniques.  23/11/23
Practical – 1

Ques. Write a program in Python to implement Linear Regression with one variable.

Code :-
# Import necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load the diabetes dataset and select a single feature (column 2, BMI) as a column vector
diabetes = load_diabetes()
X = diabetes.data[:, np.newaxis, 2]
y = diabetes.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# Create a linear regression model
model = LinearRegression()

# Train the model on the training set
model.fit(X_train, y_train)

# Make predictions on the testing set
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse:.2f}')

# Plot the actual data and the fitted regression line
plt.scatter(X_test, y_test, color='red', label='Actual Data')
plt.plot(X_test, y_pred, color='blue', label='Regression Line')
plt.xlabel('Feature')
plt.ylabel('Target')
plt.title('Linear Regression with One Variable')
plt.legend()
plt.show()
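
As an optional extension (not part of the recorded output below), the trained model can score a new observation. A minimal sketch, assuming a hypothetical, already-normalised feature value:

# Hypothetical new input: one normalised BMI value (illustrative only)
new_x = np.array([[0.05]])
print(f'Predicted target for x = 0.05: {model.predict(new_x)[0]:.2f}')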
Output :-
Practical – 2

Ques. Write a program in Python to implement Linear Regression with multiple variables.

Code :-
# Import all the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Load the diabetes dataset (all ten features)
data = load_diabetes()
X, y = data.data, data.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# Initialize and train the linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Calculate the Mean Squared Error (MSE) and R2 score as measures of performance
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse:.2f}")
print(f"R2 Score: {r2_score(y_test, y_pred):.2f}")

# Plot predicted vs. actual values, with the ideal-fit line (y = x) for reference
plt.figure(figsize=(8, 6))
plt.scatter(y_test, y_pred, color='blue', label='Predicted Values')
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red',
         linewidth=2, label='Ideal Fit (y = x)')
plt.xlabel('Actual Values')
plt.ylabel('Predicted Values')
plt.title('Linear Regression with Multiple Variables')
plt.legend()
plt.show()
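
As an optional extension, the fitted model learns one coefficient per feature plus an intercept; printing them shows each variable's contribution. A short sketch using the model trained above:

# Inspect the learned coefficients (one per diabetes feature) and the intercept
for name, coef in zip(data.feature_names, model.coef_):
    print(f'{name}: {coef:.2f}')
print(f'Intercept: {model.intercept_:.2f}')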
Output :-
Practical – 3

Ques. Write a program in Python to implement Logistic Regression.

Code :-
# Import all the necessary libraries
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.preprocessing import StandardScaler

# Load iris dataset
iris = load_iris()

# Selecting features (X) and target variable (y)
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Scale the data (fit the scaler on the training set only)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize and train the logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Print classification report and accuracy
print(classification_report(y_test, y_pred))
print("Accuracy :", accuracy_score(y_test, y_pred) * 100, '%')

# Plot the confusion matrix heatmap (iris has three classes)
conf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Predicted 0', 'Predicted 1', 'Predicted 2'],
            yticklabels=['Actual 0', 'Actual 1', 'Actual 2'])
plt.title('Confusion Matrix')
plt.show()
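
As an optional extension, logistic regression also yields class probabilities rather than only hard labels. A minimal sketch on the first few test rows (purely illustrative, using the model fitted above):

# Predicted probability of each class for the first three test samples
probs = model.predict_proba(X_test[:3])
for row in probs:
    print(['{:.2f}'.format(p) for p in row])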

Output :-
Practical – 4

Ques. Write a program in Python to implement SVM Classifier.

Code :-
# Import all the necessary libraries
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.datasets import load_wine
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.preprocessing import StandardScaler

# Load wine dataset
wine = load_wine()

# Selecting features (X) and target variable (y)
X = wine.data
y = wine.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Scale the data (fit the scaler on the training set only)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize and train the SVM model (RBF kernel by default)
model = SVC()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Print classification report and accuracy
print(classification_report(y_test, y_pred))
print("Accuracy :", accuracy_score(y_test, y_pred) * 100, '%')

# Plot the confusion matrix heatmap (wine has three classes)
conf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Predicted 0', 'Predicted 1', 'Predicted 2'],
            yticklabels=['Actual 0', 'Actual 1', 'Actual 2'])
plt.title('Confusion Matrix')
plt.show()
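
As an optional extension, the SVM's decision boundary depends strongly on the kernel and the regularisation parameter C. A hedged sketch comparing a few standard kernels on the same split (the kernel list and C value are illustrative choices, not tuned):

# Compare test accuracy for a few common kernels
for kernel in ['linear', 'poly', 'rbf']:
    clf = SVC(kernel=kernel, C=1.0)
    clf.fit(X_train, y_train)
    print(f'{kernel}: {clf.score(X_test, y_test):.3f}')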

Output :-
Practical – 5

Ques. Write a program in Python to implement KNN Classifier.

Code :-
#WAP to implement KNN algorithm on iris dataset
# Import all the necessary libraries
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.preprocessing import StandardScaler

# Load iris dataset
iris = load_iris()

# Selecting features (X) and target variable (y)
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Scale the data (KNN is distance-based, so scaling matters)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize and train the KNN model (default k = 5)
model = KNeighborsClassifier()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Print classification report and accuracy
print(classification_report(y_test, y_pred))
print("Accuracy :", accuracy_score(y_test, y_pred) * 100, '%')

# Plot the confusion matrix heatmap
conf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Predicted 0', 'Predicted 1', 'Predicted 2'],
            yticklabels=['Actual 0', 'Actual 1', 'Actual 2'])
plt.title('Confusion Matrix')
plt.show()
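
As an optional extension, the choice of k controls KNN's bias-variance trade-off. A small sketch scanning a few odd values of k on the same split (the range is arbitrary, for illustration only):

# Test accuracy for k = 1, 3, ..., 11
for k in range(1, 12, 2):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    print(f'k={k}: accuracy={knn.score(X_test, y_test):.3f}')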

Output :-
Practical – 6

Ques. Write a program in Python to implement a Decision Tree Classifier.

Code :-
# Import all the necessary libraries
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.preprocessing import StandardScaler

# Load iris dataset
iris = load_iris()

# Selecting features (X) and target variable (y)
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Scale the data (decision trees are scale-invariant, so this step is optional here)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize and train the Decision Tree model
model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Print classification report and accuracy
print(classification_report(y_test, y_pred))
print("Accuracy :", accuracy_score(y_test, y_pred) * 100, '%')

# Plot the confusion matrix heatmap
conf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Predicted 0', 'Predicted 1', 'Predicted 2'],
            yticklabels=['Actual 0', 'Actual 1', 'Actual 2'])
plt.title('Confusion Matrix')
plt.show()
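
As an optional extension, a fitted tree can be drawn directly, which makes the learned splits easy to inspect. A minimal sketch using scikit-learn's built-in plot_tree on the model above (note the split thresholds appear in scaled units):

# Visualize the trained decision tree
from sklearn.tree import plot_tree
plt.figure(figsize=(12, 8))
plot_tree(model, feature_names=iris.feature_names,
          class_names=list(iris.target_names), filled=True)
plt.show()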

Output :-
Practical – 7

Ques. Write a program in Python to implement the Naïve Bayes Classifier.

Code :-
#WAP to implement Naive Bayesian algorithm on iris dataset
# Import all the necessary libraries
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.preprocessing import StandardScaler

# Load iris dataset
iris = load_iris()

# Selecting features (X) and target variable (y)
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Scale the data (fit the scaler on the training set only)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize and train the Naive Bayes model
model = GaussianNB()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Print classification report and accuracy
print(classification_report(y_test, y_pred))
print("Accuracy :", accuracy_score(y_test, y_pred) * 100, '%')

# Plot the confusion matrix heatmap
conf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Predicted 0', 'Predicted 1', 'Predicted 2'],
            yticklabels=['Actual 0', 'Actual 1', 'Actual 2'])
plt.title('Confusion Matrix')
plt.show()
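
As an optional extension, Gaussian Naive Bayes fits one mean and variance per feature per class and combines them with class priors via Bayes' rule. A hedged peek at the learned parameters (attribute names as in current scikit-learn versions):

# Learned class priors and per-class feature means
print('Class priors:', model.class_prior_)
print('Per-class feature means:')
print(model.theta_)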

Output :-
Practical – 8

Ques. Write a program in Python to implement the Random Forest Classifier.

Code :-
#WAP to implement Random Forest algorithm on iris dataset
# Import all the necessary libraries
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
from sklearn.preprocessing import StandardScaler

# Load iris dataset
iris = load_iris()

# Selecting features (X) and target variable (y)
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=5)

# Scale the data (fit the scaler on the training set only)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Initialize and train the Random Forest model
model = RandomForestClassifier()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Print classification report and accuracy
print(classification_report(y_test, y_pred))
print("Accuracy :", accuracy_score(y_test, y_pred) * 100, '%')

# Plot the confusion matrix heatmap
conf_matrix = confusion_matrix(y_test, y_pred)
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Predicted 0', 'Predicted 1', 'Predicted 2'],
            yticklabels=['Actual 0', 'Actual 1', 'Actual 2'])
plt.title('Confusion Matrix')
plt.show()
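
As an optional extension, a random forest exposes an impurity-based importance score for each feature, useful for a quick ranking. A short sketch over the model trained above:

# Rank features by their impurity-based importance
for name, score in sorted(zip(iris.feature_names, model.feature_importances_),
                          key=lambda t: -t[1]):
    print(f'{name}: {score:.3f}')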

Output :-
Practical – 9

Ques. Build an Artificial Neural Network (ANN) by implementing the Back Propagation Algorithm.

Code :-
# Import all the necessary libraries
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Load breast cancer dataset
breast_cancer = load_breast_cancer()
X, y = breast_cancer.data, breast_cancer.target

# Normalize the data
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

# Neural network parameters
input_size = X_train.shape[1]
hidden_size = 5
output_size = 1
learning_rate = 0.01
epochs = 1000

# Initialize weights and biases
np.random.seed(42)
weights_input_hidden = np.random.randn(input_size, hidden_size)
biases_hidden = np.zeros((1, hidden_size))
weights_hidden_output = np.random.randn(hidden_size, output_size)
biases_output = np.zeros((1, output_size))

# Sigmoid activation function and its derivative
# (the derivative takes the already-activated output: s'(z) = s(z) * (1 - s(z)))
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

# Training the neural network using backpropagation
for epoch in range(epochs):
    # Forward pass
    hidden_input = np.dot(X_train, weights_input_hidden) + biases_hidden
    hidden_output = sigmoid(hidden_input)
    final_input = np.dot(hidden_output, weights_hidden_output) + biases_output
    predicted_output = sigmoid(final_input)

    # Compute the error (target minus prediction)
    error = y_train.reshape(-1, 1) - predicted_output

    # Backpropagation: propagate the error through the output and hidden layers
    output_error = error * sigmoid_derivative(predicted_output)
    hidden_layer_error = output_error.dot(weights_hidden_output.T) * sigmoid_derivative(hidden_output)

    # Update weights and biases (delta rule: W += activation^T . delta . learning_rate)
    weights_hidden_output += hidden_output.T.dot(output_error) * learning_rate
    biases_output += np.sum(output_error, axis=0, keepdims=True) * learning_rate
    weights_input_hidden += X_train.T.dot(hidden_layer_error) * learning_rate
    biases_hidden += np.sum(hidden_layer_error, axis=0, keepdims=True) * learning_rate

# Make predictions on the test set (one forward pass with the trained weights)
hidden_input = np.dot(X_test, weights_input_hidden) + biases_hidden
hidden_output = sigmoid(hidden_input)
final_input = np.dot(hidden_output, weights_hidden_output) + biases_output
predicted_output = sigmoid(final_input)

# Convert predicted probabilities to binary predictions (0 or 1)
binary_predictions = (predicted_output > 0.5).astype(int).ravel()

# Evaluate the accuracy
accuracy = accuracy_score(y_test, binary_predictions)
print(f"Accuracy: {100*accuracy:.2f}%")

Output :-
Practical – 10

Ques. Write a program in Python to implement K-means Algorithm.

Code :-
# Import all the necessary libraries
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.datasets import load_wine

# Load dataset
wine = load_wine()
X = wine.data
y = wine.target

# Standardize the features
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# Apply K-means clustering
kmeans = KMeans(n_clusters=4)
kmeans_labels = kmeans.fit_predict(X_std)

# Apply Hierarchical clustering
agg_clustering = AgglomerativeClustering(n_clusters=4)
agg_labels = agg_clustering.fit_predict(X_std)

# Evaluate clustering quality using silhouette score
kmeans_silhouette = silhouette_score(X_std, kmeans_labels)
agg_silhouette = silhouette_score(X_std, agg_labels)

# Print silhouette scores
print(f"K-means Silhouette Score: {kmeans_silhouette}")
print(f"Hierarchical Silhouette Score: {agg_silhouette}")

# Visualize the clustering results using PCA for dimensionality reduction
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_std)

plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
for cluster in range(4):
    plt.scatter(X_pca[kmeans_labels == cluster, 0],
                X_pca[kmeans_labels == cluster, 1],
                label=f'Cluster {cluster + 1}')
plt.title('K-means Clustering')
plt.legend()

plt.subplot(1, 2, 2)
for cluster in range(4):
    plt.scatter(X_pca[agg_labels == cluster, 0],
                X_pca[agg_labels == cluster, 1],
                label=f'Cluster {cluster + 1}')
plt.title('Hierarchical Clustering')
plt.legend()
plt.show()
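
The number of clusters was fixed at 4 above; in practice k is often chosen by inspecting the inertia (within-cluster sum of squares) over candidate values, the so-called elbow method. An optional sketch over the standardised wine data (the k range is illustrative):

# Elbow method: plot inertia for k = 2..8
inertias = []
for k in range(2, 9):
    km = KMeans(n_clusters=k).fit(X_std)
    inertias.append(km.inertia_)
plt.plot(range(2, 9), inertias, marker='o')
plt.xlabel('Number of clusters (k)')
plt.ylabel('Inertia')
plt.title('Elbow Method')
plt.show()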

Output :-
Practical – 11

Ques. Write a program in Python on Self Organising Map (SOM).

Code :-
# Import all the required libraries
from sklearn.datasets import load_iris
from sklearn.preprocessing import MinMaxScaler
from minisom import MiniSom
import matplotlib.pyplot as plt

# Load the iris dataset
data = load_iris()
X, y = data.data, data.target

# Normalize the data to [0, 1]
scaler = MinMaxScaler()
X = scaler.fit_transform(X)

# SOM parameters
som_grid_size = (10, 10)  # Grid size of the SOM
input_size = X.shape[1]   # Number of features in the input data
learning_rate = 0.5       # Initial learning rate
sigma = 1.0               # Initial neighborhood radius
epochs = 1000             # Number of training epochs

# Initialize SOM
som = MiniSom(som_grid_size[0], som_grid_size[1], input_size,
              sigma=sigma, learning_rate=learning_rate)

# Train the SOM on randomly sampled inputs
som.train_random(X, epochs, verbose=True)

# Visualize the SOM: plot the distance map (U-matrix) as background
plt.figure(figsize=(8, 6))
plt.pcolor(som.distance_map().T, cmap='bone_r')
plt.colorbar()

# Plot each data point's class label at its winning neuron
for x, label in zip(X, y):
    w = som.winner(x)  # best-matching unit for this sample
    plt.text(w[0] + 0.5, w[1] + 0.5, str(label),
             color=plt.cm.rainbow(label / 2.0),
             fontdict={'weight': 'bold', 'size': 9})
plt.show()
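
As an optional check, the quantization error (the average distance between each sample and its best-matching unit) gives a single number for how well the map fits the data. A one-line sketch, assuming MiniSom's quantization_error method (present in current minisom releases):

# Average distance from each sample to its winning neuron
print(f'Quantization error: {som.quantization_error(X):.4f}')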
Output :-
Practical – 12

Ques. Write a program in Python for Empirical Comparison of different Supervised learning techniques.

Code :-
# Import all the necessary libraries
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load a sample dataset (Iris dataset in this case)
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Create classifiers
svm_classifier = SVC(kernel='linear', C=1)
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
knn_classifier = KNeighborsClassifier(n_neighbors=3)

# Train classifiers
svm_classifier.fit(X_train, y_train)
rf_classifier.fit(X_train, y_train)
knn_classifier.fit(X_train, y_train)

# Predictions
svm_pred = svm_classifier.predict(X_test)
rf_pred = rf_classifier.predict(X_test)
knn_pred = knn_classifier.predict(X_test)

# Evaluate performance
print("Support Vector Machine:")
print(f"Accuracy: {accuracy_score(y_test, svm_pred)}")
print("Classification Report:")
print(classification_report(y_test, svm_pred))

print("\nRandom Forest:")
print(f"Accuracy: {accuracy_score(y_test, rf_pred)}")
print("Classification Report:")
print(classification_report(y_test, rf_pred))

print("\nk-Nearest Neighbors:")
print(f"Accuracy: {accuracy_score(y_test, knn_pred)}")
print("Classification Report:")
print(classification_report(y_test, knn_pred))

# Cross-validation for additional comparison
# (note: run on the raw features here, so SVM and KNN see unscaled data)
svm_scores = cross_val_score(svm_classifier, X, y, cv=5)
rf_scores = cross_val_score(rf_classifier, X, y, cv=5)
knn_scores = cross_val_score(knn_classifier, X, y, cv=5)

print("\nCross-validation Scores:")
print("Support Vector Machine:", np.mean(svm_scores))
print("Random Forest:", np.mean(rf_scores))
print("k-Nearest Neighbors:", np.mean(knn_scores))
Output :-
Practical – 13

Ques. Write a program in Python for Empirical Comparison of different Unsupervised learning techniques.

Code :-
# Import all the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering
from minisom import MiniSom
from sklearn.metrics import silhouette_score

# Generate a synthetic dataset
X, _ = make_blobs(n_samples=300, centers=4, random_state=42, cluster_std=1.0)

# Standardize the features
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Apply K-means clustering
kmeans = KMeans(n_clusters=4, random_state=42)
kmeans_labels = kmeans.fit_predict(X_std)

# Apply Hierarchical clustering
agg_clustering = AgglomerativeClustering(n_clusters=4)
agg_labels = agg_clustering.fit_predict(X_std)

# Apply Kohonen Self-Organizing Maps (SOM)
som = MiniSom(10, 10, X_std.shape[1], sigma=1.0, learning_rate=0.5)
som.train_random(X_std, 100)
# Use the row coordinate of each sample's winning neuron as a coarse cluster label
som_labels = np.array([som.winner(x) for x in X_std]).T[0]

# Evaluate clustering quality using silhouette score
kmeans_silhouette = silhouette_score(X_std, kmeans_labels)
agg_silhouette = silhouette_score(X_std, agg_labels)
som_silhouette = silhouette_score(X_std, som_labels)

# Print silhouette scores
print(f"K-means Silhouette Score: {kmeans_silhouette}")
print(f"Hierarchical Silhouette Score: {agg_silhouette}")
print(f"SOM Silhouette Score: {som_silhouette}")

# Visualize the three clustering results side by side
plt.figure(figsize=(15, 5))
plt.subplot(1, 3, 1)
plt.scatter(X_std[:, 0], X_std[:, 1], c=kmeans_labels, cmap='viridis')
plt.title('K-means Clustering')

plt.subplot(1, 3, 2)
plt.scatter(X_std[:, 0], X_std[:, 1], c=agg_labels, cmap='viridis')
plt.title('Hierarchical Clustering')

plt.subplot(1, 3, 3)
plt.scatter(X_std[:, 0], X_std[:, 1], c=som_labels, cmap='viridis')
plt.title('SOM Clustering')
plt.show()

Output :-
