Unit 2
December 25, 2023
3) Implement the Euclidean distance and cosine similarity metrics from scratch in Python and
apply them to compare two vectors or data points.
[ ]: import math

# Function to calculate Euclidean distance
def euclidean_distance(vector1, vector2):
    if len(vector1) != len(vector2):
        raise ValueError("Both vectors must have the same dimensions.")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(vector1, vector2)))

# Function to calculate cosine similarity
def cosine_similarity(vector1, vector2):
    if len(vector1) != len(vector2):
        raise ValueError("Both vectors must have the same dimensions.")
    dot_product = sum(x * y for x, y in zip(vector1, vector2))
    magnitude_vector1 = math.sqrt(sum(x ** 2 for x in vector1))
    magnitude_vector2 = math.sqrt(sum(x ** 2 for x in vector2))
    # Guard against division by zero for zero-magnitude vectors
    if magnitude_vector1 == 0 or magnitude_vector2 == 0:
        raise ValueError("Cosine similarity is undefined for zero vectors.")
    return dot_product / (magnitude_vector1 * magnitude_vector2)

# Define two vectors
vector1 = [3, 1, 4, 1, 5]
vector2 = [2, 1, 2, 2, 3]

# Calculate Euclidean distance between vector1 and vector2
print("Euclidean distance:", euclidean_distance(vector1, vector2))

# Calculate cosine similarity between vector1 and vector2
print("Cosine similarity:", cosine_similarity(vector1, vector2))
7) Implement the KNN algorithm in Python and apply it to a dataset to make predictions for a
new data point.
[ ]: import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
# Generate a synthetic dataset
np.random.seed(42)
X = np.sort(5 * np.random.rand(100, 1), axis=0)
y = np.sin(X).ravel()
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Visualize the dataset
plt.scatter(X, y, color='blue', label='Original Data')
plt.title('Synthetic Dataset for KNN Regression')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
# Implement KNN Regression
knn_regressor = KNeighborsRegressor(n_neighbors=3)  # You can adjust the number of neighbors (k)
knn_regressor.fit(X_train, y_train)
# Make predictions for a new data point
new_data_point = np.array([[2.5]]) # Replace with your own data point
predicted_value = knn_regressor.predict(new_data_point)
# Visualize the regression line
X_range = np.linspace(0, 5, 100).reshape(-1, 1)
y_pred_range = knn_regressor.predict(X_range)
plt.scatter(X, y, color='blue', label='Original Data')
plt.plot(X_range, y_pred_range, color='red', label='KNN Regression Line')
plt.scatter(new_data_point, predicted_value, color='green', label='New Data Point Prediction', marker='x', s=100)
plt.title('KNN Regression')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
# Evaluate the model on the test set
y_pred_test = knn_regressor.predict(X_test)
mse = mean_squared_error(y_test, y_pred_test)
print(f'Mean Squared Error on Test Set: {mse:.2f}')
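The value k=3 above is an arbitrary starting point; a minimal sketch (reusing X_train and y_train from the cell above) that picks k by 5-fold cross-validated error:

[ ]: from sklearn.model_selection import cross_val_score

# Score k = 1..14 by cross-validated MSE on the training set; prefer the smallest
for k in range(1, 15):
    scores = cross_val_score(KNeighborsRegressor(n_neighbors=k), X_train, y_train,
                             cv=5, scoring='neg_mean_squared_error')
    print(f'k={k:2d}  mean CV MSE: {-scores.mean():.4f}')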
9) Train a logistic regression model on a binary classification dataset and analyze the importance
of each feature using its corresponding coefficient.
[ ]: import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Keep only classes 0 and 1 (setosa vs. versicolor) for binary classification
X_binary = X[y != 2]
y_binary = y[y != 2]
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_binary, y_binary, test_size=0.2, random_state=42)
# Train a logistic regression model
logreg_model = LogisticRegression()
logreg_model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = logreg_model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy on Test Set: {accuracy:.2f}')
# Analyze feature importance using coefficients
feature_importance = pd.DataFrame({
'Feature': iris.feature_names,
'Coefficient': logreg_model.coef_[0]
})
# Display the feature importance
print('\nFeature Importance:')
print(feature_importance)
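Raw logistic-regression coefficients are only comparable across features when the features share a scale. A minimal sketch (reusing the split from the cell above) that standardizes inside a pipeline and ranks features by absolute coefficient:

[ ]: from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Standardize features so coefficient magnitudes are directly comparable
pipeline = make_pipeline(StandardScaler(), LogisticRegression())
pipeline.fit(X_train, y_train)

coefs = pipeline.named_steps['logisticregression'].coef_[0]
importance = pd.DataFrame({'Feature': iris.feature_names, 'Coefficient': coefs})
importance['|Coefficient|'] = importance['Coefficient'].abs()
print(importance.sort_values('|Coefficient|', ascending=False))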
11) Implement a linear SVM classifier using Python’s scikit-learn library for a binary classification
problem. Visualize the decision boundary and support vectors.
[ ]: import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
# Load a synthetic dataset for binary classification
X, y = datasets.make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, n_classes=2, n_clusters_per_class=1, random_state=42)
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a linear SVM classifier
svm_classifier = SVC(kernel='linear')
svm_classifier.fit(X_train, y_train)
# Visualize the decision boundary and support vectors
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, cmap='viridis', marker='o',
            label='Training Data')
plt.scatter(svm_classifier.support_vectors_[:, 0], svm_classifier.support_vectors_[:, 1],
            s=100, facecolors='none', edgecolors='k', marker='o', label='Support Vectors')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
# Create a meshgrid to plot the decision boundary
xx, yy = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 100),
np.linspace(X[:, 1].min(), X[:, 1].max(), 100))
Z = svm_classifier.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
# Plot decision boundary
plt.contour(xx, yy, Z, colors='k', levels=[-1, 0, 1], alpha=0.5, linestyles=['--', '-', '--'])
plt.title('Linear SVM Classifier with Decision Boundary and Support Vectors')
plt.legend()
plt.show()
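Because the kernel is linear, the fitted hyperplane can be read off directly. A minimal sketch (reusing svm_classifier from the cell above) that prints the weights and the margin width 2/||w||, i.e. the distance between the two dashed lines in the plot:

[ ]: # For a linear SVM, decision_function(x) = w . x + b
w = svm_classifier.coef_[0]
b = svm_classifier.intercept_[0]
print(f'w = {w}, b = {b:.4f}')

# Geometric margin between the two dashed level curves (-1 and +1)
print(f'Margin width: {2 / np.linalg.norm(w):.4f}')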
13) Build a Naive Bayes classifier to classify text documents into different categories. Preprocess
the text data and use the Laplace smoothing technique to handle unseen words.
[ ]: import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
# Sample text data for illustration
data = {'text': ["This is a positive document.",
"Negative sentiment detected in this text.",
"The sentiment in this document is positive.",
"This is another positive example."],
'label': ['Positive', 'Negative', 'Positive', 'Positive']}
df = pd.DataFrame(data)
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(df['text'], df['label'], test_size=0.2, random_state=42)
# Preprocess the text data using CountVectorizer
vectorizer = CountVectorizer()
X_train_vectorized = vectorizer.fit_transform(X_train)
X_test_vectorized = vectorizer.transform(X_test)
# Build a Naive Bayes classifier with Laplace smoothing
naive_bayes_classifier = MultinomialNB(alpha=1.0)  # Laplace smoothing parameter (alpha=1.0 for add-one smoothing)
naive_bayes_classifier.fit(X_train_vectorized, y_train)
# Make predictions on the test set
y_pred = naive_bayes_classifier.predict(X_test_vectorized)
# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}\n')
print('Classification Report:')
print(classification_report(y_test, y_pred))
# Test with a new document
new_document = ["This document is very positive and contains positive words."]
new_document_vectorized = vectorizer.transform(new_document)
predicted_category = naive_bayes_classifier.predict(new_document_vectorized)
print(f'\nPredicted Category for the New Document: {predicted_category[0]}')
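To see what alpha=1.0 actually buys, here is a minimal from-scratch sketch of Laplace smoothing with hypothetical word counts (not tied to the cell above). The smoothed likelihood is P(w|c) = (count(w, c) + alpha) / (total_c + alpha * V), so an unseen word gets a small but nonzero probability instead of zeroing out the whole product:

[ ]: alpha = 1.0
vocab_size = 6                                          # V: hypothetical vocabulary size
word_counts_positive = {'positive': 3, 'document': 2}   # hypothetical counts for class "Positive"
total_positive = sum(word_counts_positive.values())

def smoothed_prob(word):
    count = word_counts_positive.get(word, 0)
    return (count + alpha) / (total_positive + alpha * vocab_size)

print(smoothed_prob('positive'))   # seen word
print(smoothed_prob('terrible'))   # unseen word -> nonzero thanks to smoothing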
14) Compare the performance of the Naive Bayes algorithm with other classification algorithms on
a given dataset.
[ ]: import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define classifiers
classifiers = {
    'Naive Bayes': GaussianNB(),
    'Random Forest': RandomForestClassifier(random_state=42),
    'Support Vector Machine': SVC(kernel='linear', random_state=42),
    'K-Nearest Neighbors': KNeighborsClassifier(),
}
# Train and evaluate each classifier
results = []
for clf_name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    classification_rep = classification_report(y_test, y_pred, target_names=iris.target_names, output_dict=True)
    results.append({'Classifier': clf_name, 'Accuracy': accuracy, 'Classification Report': classification_rep})
# Display the results
df_results = pd.DataFrame(results)
print(df_results)
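A single 80/20 split of 150 samples is noisy; a minimal sketch (reusing the classifiers dict above) that compares the same models with 5-fold cross-validation, keeping the scaling inside each fold via a pipeline to avoid leakage:

[ ]: from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Mean cross-validated accuracy is a steadier comparison than one split
for clf_name, clf in classifiers.items():
    pipe = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f'{clf_name}: {scores.mean():.3f} +/- {scores.std():.3f}')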
15) Build a Random Forest classifier using scikit-learn and apply it to a dataset.
[ ]: import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build a Random Forest classifier
random_forest_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
random_forest_classifier.fit(X_train, y_train)
# Make predictions on the test set
y_pred = random_forest_classifier.predict(X_test)
# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}\n')
print('Classification Report:')
print(classification_report(y_test, y_pred, target_names=iris.target_names))
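Random forests also expose impurity-based feature importances; a minimal sketch (reusing random_forest_classifier from the cell above) that ranks the Iris features:

[ ]: # Impurity-based importances learned by the forest (they sum to 1)
importances = pd.DataFrame({
    'Feature': iris.feature_names,
    'Importance': random_forest_classifier.feature_importances_
}).sort_values('Importance', ascending=False)
print(importances)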