K-Means Clustering Example:
We are given four data points: A(2,3), B(3,3), C(8,6), and D(9,5). We want to
partition them into two clusters using the K-Means algorithm with K=2. We'll use
Euclidean distance to measure how close each point is to each centroid.
Step 1: Choose Initial Centroids
Let’s pick A(2,3) and C(8,6) as the initial centroids:
- Centroid 1 (Cluster 1): A(2,3)
- Centroid 2 (Cluster 2): C(8,6)
Step 2: Assign Points to the Nearest Centroid (Using Euclidean Distance)
Euclidean Distance Formula:
d = √[(x₂ − x₁)² + (y₂ − y₁)²]
Calculations:
- A(2,3) to A = 0
- A(2,3) to C = √[(8−2)² + (6−3)²] = √45 ≈ 6.71
- B(3,3) to A = √1 = 1.0
- B(3,3) to C = √34 ≈ 5.83
- C(8,6) to A = √45 ≈ 6.71
- C(8,6) to C = 0
- D(9,5) to A = √53 ≈ 7.28
- D(9,5) to C = √2 ≈ 1.41
Point    Distance to A(2,3)   Distance to C(8,6)   Assigned Cluster
A(2,3)   0                    6.71                 Cluster 1
B(3,3)   1.0                  5.83                 Cluster 1
C(8,6)   6.71                 0                    Cluster 2
D(9,5)   7.28                 1.41                 Cluster 2
Resulting Clusters:
- Cluster 1: A(2,3), B(3,3)
- Cluster 2: C(8,6), D(9,5)
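The assignment step above can be reproduced with a few lines of Python. This is a minimal sketch, not a reference implementation; the names `points`, `centroids`, `euclidean`, and `assign` are illustrative choices and do not come from the example itself.

```python
import math

# The four points and the initial centroids from the example.
points = {"A": (2, 3), "B": (3, 3), "C": (8, 6), "D": (9, 5)}
centroids = [(2, 3), (8, 6)]  # Centroid 1 = A, Centroid 2 = C


def euclidean(p, q):
    """Euclidean distance between two 2-D points."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def assign(points, centroids):
    """Map each point name to the 1-based number of its nearest centroid."""
    assignment = {}
    for name, p in points.items():
        distances = [euclidean(p, c) for c in centroids]
        assignment[name] = distances.index(min(distances)) + 1
    return assignment


assignment = assign(points, centroids)
for name, cluster in assignment.items():
    print(f"{name} -> Cluster {cluster}")
```

If two centroids were exactly equidistant from a point, this sketch would pick the lower-numbered cluster; that tie-breaking rule is an assumption, since no tie occurs in the example.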
Step 3: Recalculate Centroids
Compute each new centroid as the coordinate-wise mean of the points in its cluster.
New Centroid 1 (A and B):
x = (2 + 3)/2 = 2.5
y = (3 + 3)/2 = 3.0 → (2.5, 3)
New Centroid 2 (C and D):
x = (8 + 9)/2 = 8.5
y = (6 + 5)/2 = 5.5 → (8.5, 5.5)
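The same averaging can be checked in Python. This is a small sketch under the assumption that each cluster is stored as a list of (x, y) tuples; the helper name `mean_point` is hypothetical.

```python
def mean_point(cluster_points):
    """Coordinate-wise mean of a non-empty list of (x, y) points."""
    n = len(cluster_points)
    mean_x = sum(x for x, _ in cluster_points) / n
    mean_y = sum(y for _, y in cluster_points) / n
    return (mean_x, mean_y)


cluster_1 = [(2, 3), (3, 3)]  # A and B
cluster_2 = [(8, 6), (9, 5)]  # C and D

new_centroid_1 = mean_point(cluster_1)
new_centroid_2 = mean_point(cluster_2)
print(new_centroid_1, new_centroid_2)
```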
Step 4: Reassign Points Based on New Centroids
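Recalculate the distance from each point to both new centroids:
- A(2,3) to (2.5, 3) = √0.25 = 0.5
- A(2,3) to (8.5, 5.5) = √48.5 ≈ 6.96
- B(3,3) to (2.5, 3) = √0.25 = 0.5
- B(3,3) to (8.5, 5.5) = √36.5 ≈ 6.04
- C(8,6) to (2.5, 3) = √39.25 ≈ 6.26
- C(8,6) to (8.5, 5.5) = √0.5 ≈ 0.71
- D(9,5) to (2.5, 3) = √46.25 ≈ 6.80
- D(9,5) to (8.5, 5.5) = √0.5 ≈ 0.71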
Point    Distance to (2.5, 3)   Distance to (8.5, 5.5)   Assigned Cluster
A(2,3)   0.5                    6.96                     Cluster 1
B(3,3)   0.5                    6.04                     Cluster 1
C(8,6)   6.26                   0.71                     Cluster 2
D(9,5)   6.80                   0.71                     Cluster 2
Final Clusters:
- Cluster 1: A(2,3), B(3,3)
- Cluster 2: C(8,6), D(9,5)
Because no cluster assignment changed after the centroids were updated, the
algorithm has converged and stops here.
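Putting the steps together, the whole procedure might be sketched as the loop below. It is an illustrative sketch rather than a definitive implementation: the function name `kmeans`, the `max_iter` safeguard, the 0-based cluster labels, and the choice to leave an empty cluster's centroid where it is are all assumptions that go beyond the worked example. It relies on `math.dist`, which is available in Python 3.8 and later.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain K-Means on 2-D points.

    Returns (labels, centroids, passes), where labels[i] is the 0-based
    cluster index of points[i] and passes counts the assignment passes run.
    """
    labels = None
    for passes in range(1, max_iter + 1):
        # Assignment step: index of the nearest centroid for each point.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Convergence check: stop when no point changed cluster.
        if new_labels == labels:
            return labels, centroids, passes
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                updated.append(old)
        centroids = updated
    return labels, centroids, max_iter


points = [(2, 3), (3, 3), (8, 6), (9, 5)]  # A, B, C, D
initial = [(2, 3), (8, 6)]                 # A and C as initial centroids
labels, final_centroids, passes = kmeans(points, initial)
print(labels, final_centroids, passes)
```

On the four points above, the loop runs two assignment passes, matching Steps 2 and 4: the second pass yields the same labels as the first, so the function returns. Label 0 corresponds to Cluster 1 and label 1 to Cluster 2.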