K-Means Clustering from Scratch
Algorithm Explanation
1. Initialization: Choose k random centroids.
2. Assignment: Assign each data point to the nearest centroid based on Euclidean distance.
3. Update: Recompute the centroids as the mean of the points assigned to each cluster.
4. Repeat: Continue the assign-update steps until convergence, i.e., the centroids stop changing (a minimal one-iteration sketch follows this list).
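To make steps 2 and 3 concrete before the full implementation, here is a minimal one-iteration sketch on a tiny hand-made 2-D dataset (the points and starting centroids below are made up purely for illustration):

```python
import numpy as np

# Hypothetical toy data: four 2-D points and k = 2 starting centroids
X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]])
centroids = np.array([[1.0, 1.0], [9.0, 9.5]])  # step 1: initialization

# Step 2: assignment -- Euclidean distance from every point to every centroid
distances = np.linalg.norm(X[:, np.newaxis] - centroids, axis=2)
labels = np.argmin(distances, axis=1)           # -> [0, 0, 1, 1]

# Step 3: update -- each centroid becomes the mean of its assigned points
centroids = np.array([X[labels == i].mean(axis=0) for i in range(2)])
print(centroids)  # [[1.25 1.5 ] [8.5  8.75]]

# Step 4: repeat steps 2-3 until the centroids stop changing
```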
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def initialize_centroids(X, k):
    """Randomly pick k distinct data points as the initial centroids."""
    indices = np.random.choice(X.shape[0], k, replace=False)
    return X[indices]

def assign_clusters(X, centroids):
    """Assign each data point to the closest centroid."""
    # Pairwise Euclidean distances, shape (n_samples, k)
    distances = np.linalg.norm(X[:, np.newaxis] - centroids, axis=2)
    return np.argmin(distances, axis=1)

def update_centroids(X, labels, k):
    """Recalculate each centroid as the mean of its assigned points."""
    return np.array([X[labels == i].mean(axis=0) for i in range(k)])

def kmeans(X, k, max_iters=100):
    """K-Means clustering."""
    centroids = initialize_centroids(X, k)
    for _ in range(max_iters):
        labels = assign_clusters(X, centroids)
        new_centroids = update_centroids(X, labels, k)
        # Stop once the centroids no longer move
        if np.allclose(centroids, new_centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Load CSV file
data = pd.read_csv('data.csv')
X = data[['Feature1', 'Feature2']].values

# Apply K-Means from scratch
k = 3  # Number of clusters
labels, centroids = kmeans(X, k)

# Plot the clustered points and the final centroids
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(centroids[:, 0], centroids[:, 1], s=300, c='red', marker='X')
plt.show()
```
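If no data.csv is at hand, the implementation above can be smoke-tested on synthetic data. A minimal sketch, assuming the kmeans function and the imports from the block above are in scope (make_blobs is used here only to generate toy points):

```python
from sklearn.datasets import make_blobs

# Synthetic data: 300 points scattered around 3 centers (illustrative only)
X_demo, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0,
                       random_state=42)

labels_demo, centroids_demo = kmeans(X_demo, k=3)
plt.scatter(X_demo[:, 0], X_demo[:, 1], c=labels_demo, cmap='viridis')
plt.scatter(centroids_demo[:, 0], centroids_demo[:, 1], s=300, c='red',
            marker='X')
plt.show()
```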
K-Means Using sklearn
```python
from sklearn.cluster import KMeans
import pandas as pd
import matplotlib.pyplot as plt

# Load CSV file
data = pd.read_csv('data.csv')
X = data[['Feature1', 'Feature2']].values

# Apply K-Means using sklearn
# (explicit n_init and a fixed random_state keep the result reproducible)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Plot the clusters and the fitted centroids
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
            s=300, c='red', marker='X')
plt.show()
```
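The choice n_clusters=3 above is arbitrary. One common heuristic for picking it is the elbow plot of the model's inertia_ (the within-cluster sum of squared distances). A minimal sketch, assuming X is the feature matrix loaded above:

```python
# Elbow heuristic: plot inertia for several candidate cluster counts
inertias = []
for k_try in range(1, 10):
    model = KMeans(n_clusters=k_try, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)

plt.plot(range(1, 10), inertias, marker='o')
plt.xlabel('Number of clusters k')
plt.ylabel('Inertia')
plt.show()
```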
K-Nearest Neighbors (KNN) from Scratch
Algorithm Explanation
1. Distance Calculation: For a given test point, calculate the Euclidean distance to all training points.
2. Neighbors Selection: Select the k nearest neighbors.
3. Majority Voting: Assign the class held by the majority of these neighbors to the test point (see the worked sketch after this list).
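Here is the worked sketch referred to above: a tiny hand-made example with five training points and one test point (all numbers are made up for illustration):

```python
import numpy as np
from collections import Counter

# Hypothetical training data: five 2-D points with two class labels
X_train = np.array([[1, 1], [2, 1], [1, 2], [8, 8], [9, 8]])
y_train = np.array(['A', 'A', 'B', 'B', 'B'])
test_point = np.array([1.5, 1.5])

# Step 1: distances from the test point to every training point
distances = np.linalg.norm(X_train - test_point, axis=1)

# Step 2: indices of the k = 3 nearest neighbors
k_indices = np.argsort(distances)[:3]   # -> points [1,1], [2,1], [1,2]

# Step 3: majority vote among their labels ('A', 'A', 'B' -> 'A')
prediction = Counter(y_train[k_indices]).most_common(1)[0][0]
print(prediction)  # 'A'
```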
```python
import numpy as np
import pandas as pd
from collections import Counter

def euclidean_distance(x1, x2):
    """Calculate the Euclidean distance between two points."""
    return np.sqrt(np.sum((x1 - x2) ** 2))

def knn_predict(X_train, y_train, X_test, k=3):
    """KNN classification: majority vote among the k nearest neighbors."""
    predictions = []
    for test_point in X_test:
        distances = [euclidean_distance(test_point, x) for x in X_train]
        k_indices = np.argsort(distances)[:k]
        k_nearest_labels = [y_train[i] for i in k_indices]
        most_common = Counter(k_nearest_labels).most_common(1)[0][0]
        predictions.append(most_common)
    return np.array(predictions)

# Load CSV file
data = pd.read_csv('data.csv')
X = data[['Feature1', 'Feature2']].values
y = data['Label'].values

# Simple split: first 80 rows for training, the rest for testing
# (this assumes the rows are already shuffled)
X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]

# Apply KNN from scratch
k = 3
predictions = knn_predict(X_train, y_train, X_test, k)

# Compare predictions with the actual test labels
print('Predicted labels:', predictions)
print('Actual labels:', y_test)
```
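The from-scratch version above only prints the two label arrays. To get an accuracy figure comparable to the sklearn run below, the arrays can be compared directly; a one-line addition, assuming predictions and y_test from the block above:

```python
# Fraction of test points whose predicted label matches the true label
accuracy = np.mean(predictions == y_test)
print('Accuracy:', accuracy)
```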
KNN Using sklearn
```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load CSV file
data = pd.read_csv('data.csv')
X = data[['Feature1', 'Feature2']].values
y = data['Label'].values

# Split the data into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Apply KNN using sklearn
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
predictions = knn.predict(X_test)

# Check accuracy
accuracy = accuracy_score(y_test, predictions)
print('Accuracy:', accuracy)
```
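The value n_neighbors=3 is again an arbitrary choice. One standard way to pick it is cross-validation on the training set; a minimal sketch using sklearn's cross_val_score, assuming X_train and y_train from the block above:

```python
from sklearn.model_selection import cross_val_score

# Mean 5-fold cross-validated accuracy for a few candidate k values
for k in [1, 3, 5, 7, 9]:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k),
                             X_train, y_train, cv=5)
    print(k, scores.mean())
```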
Explanation of Code:
1. K-Means from Scratch:
   - Centroids are initialized randomly, then clusters are assigned and centroids updated iteratively until convergence.
   - The clusters are visualized by plotting the points and the final centroids.
2. KNN from Scratch:
   - The Euclidean distance is calculated for each test point against all training points.
   - The k nearest points are selected, and the majority class label among them is predicted.
3. Using sklearn:
   - Both the K-Means and KNN implementations are simplified by using the sklearn library, allowing for faster execution and easier integration with different datasets.
Make sure to adjust the CSV file path and the feature and label column names (Feature1, Feature2, Label) to match your dataset.