Hierarchical Clustering Technique
What is Hierarchical Clustering?
Hierarchical clustering is a technique that builds a hierarchy (a tree structure) of nested clusters.
Unlike K-means, it doesn't require specifying the number of clusters in advance. It is mainly used in:
- Customer segmentation
- Image analysis
- Bioinformatics (gene clustering)
Principle Behind Hierarchical Clustering
The key idea is to group similar data points into clusters by using a distance metric (e.g., Euclidean
distance) and a linkage criterion (e.g., single, complete, average).
It produces a dendrogram, a tree-like diagram showing how data points are merged or split.
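For instance, the distance-metric step can be computed with SciPy. Below is a minimal sketch using four made-up 2-D points (the data and variable names are illustrative):

```python
# Minimal sketch: computing the pairwise Euclidean distances that
# hierarchical clustering operates on (assumes NumPy and SciPy).
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Four hypothetical 2-D points
X = np.array([[0.0, 0.0],
              [1.0, 1.0],
              [5.0, 4.0],
              [6.0, 5.0]])

condensed = pdist(X, metric="euclidean")  # condensed 1-D distance vector
D = squareform(condensed)                 # full symmetric distance matrix
print(np.round(D, 2))
```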
Two Main Approaches
1. Bottom-Up (Agglomerative) Clustering
Working Principle:
- Start with each data point as a separate cluster.
- At each step, merge the two closest clusters.
- Continue until all points belong to one cluster (or a stopping condition is met).
Linkage Criteria:
- Single linkage: Min distance between clusters
- Complete linkage: Max distance between clusters
- Average linkage: Average distance
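To make the merge loop and the three linkage criteria concrete, here is a simplified, unoptimized Python sketch (the names `agglomerative` and `cluster_distance` are illustrative, not from any library); it assumes a precomputed symmetric distance matrix:

```python
# Simplified agglomerative clustering over a precomputed distance matrix.
import numpy as np

def cluster_distance(D, c1, c2, linkage="single"):
    """Distance between two clusters (lists of point indices)."""
    pairwise = [D[i, j] for i in c1 for j in c2]
    if linkage == "single":      # minimum pairwise distance
        return min(pairwise)
    if linkage == "complete":    # maximum pairwise distance
        return max(pairwise)
    return sum(pairwise) / len(pairwise)  # average linkage

def agglomerative(D, linkage="single"):
    clusters = [[i] for i in range(len(D))]  # start: one cluster per point
    merges = []
    while len(clusters) > 1:
        # find the two closest clusters under the chosen linkage
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(D, clusters[a], clusters[b], linkage)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        clusters[a] = clusters[a] + clusters[b]  # merge cluster b into a
        del clusters[b]
    return merges

# Usage with the 4-point distance table from the example below
D = np.array([[0, 2, 6, 10],
              [2, 0, 5,  9],
              [6, 5, 0,  4],
              [10, 9, 4, 0]], dtype=float)
for left, right, dist in agglomerative(D, linkage="single"):
    print(f"merge {left} + {right} at distance {dist}")
```

Library implementations such as `scipy.cluster.hierarchy.linkage` use far more efficient algorithms than this exhaustive rescan of all cluster pairs.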
Example:
Given 4 data points: A, B, C, D
| Pair | Distance |
|------|----------|
| A-B | 2 |
| A-C | 6 |
| A-D | 10 |
| B-C | 5 |
| B-D | 9 |
| C-D | 4 |
Steps:
1. Merge A and B (distance 2) → Cluster1 (A, B)
2. Next closest pair: C and D (distance 4) → Cluster2 (C, D)
3. Merge Cluster1 and Cluster2 → final cluster containing all four points
Result: A dendrogram showing the merging sequence.
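The same example can be reproduced with SciPy (a sketch, assuming SciPy and Matplotlib are installed); `linkage` accepts the pairwise distances from the table in the condensed order A-B, A-C, A-D, B-C, B-D, C-D:

```python
# Reproducing the A, B, C, D example with SciPy (single linkage).
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

# Condensed distances in the order A-B, A-C, A-D, B-C, B-D, C-D
distances = np.array([2.0, 6.0, 10.0, 5.0, 9.0, 4.0])

Z = linkage(distances, method="single")
print(Z)  # each row: [cluster i, cluster j, merge distance, new cluster size]

# Plot the dendrogram of the merging sequence
dendrogram(Z, labels=["A", "B", "C", "D"])
plt.show()
```

Under single linkage, the first two merges occur at distances 2 (A-B) and 4 (C-D); the final merge occurs at distance 5, the smallest distance between the two clusters (the minimum of 6, 10, 5, and 9).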
2. Top-Down (Divisive) Clustering
Working Principle:
- Start with all data points in one big cluster.
- At each step, split a cluster into two sub-clusters so that the least similar points are separated.
- Continue splitting until each point is its own cluster or a stopping criterion is met.
Example:
Start with: Cluster {A, B, C, D}
1. Split into {A, B} and {C, D} based on dissimilarity
2. Further divide {A, B} into A and B, and {C, D} into C and D
3. Final result: four singleton clusters
Result: The reverse of the agglomerative approach; the dendrogram is built from the top and splits downward.
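Divisive clustering is rarely available as an off-the-shelf routine; one common simplification is recursive bisection, i.e. repeatedly splitting the current cluster into two, for example with 2-means. The sketch below (assuming scikit-learn is available; the function name `divisive` and the sample points are illustrative) builds the splitting hierarchy for four points:

```python
# Simplified divisive (top-down) sketch: recursively bisect clusters
# with 2-means until every point ends up in its own cluster.
import numpy as np
from sklearn.cluster import KMeans

def divisive(X, indices=None, min_size=1):
    """Return a nested list describing the splitting hierarchy."""
    if indices is None:
        indices = np.arange(len(X))
    if len(indices) <= min_size:
        return [int(i) for i in indices]          # stop: singleton cluster
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X[indices])
    left = indices[km.labels_ == 0]
    right = indices[km.labels_ == 1]
    if len(left) == 0 or len(right) == 0:         # degenerate split: stop
        return [int(i) for i in indices]
    return [divisive(X, left, min_size), divisive(X, right, min_size)]

X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 4.0], [6.0, 5.0]])  # A, B, C, D
print(divisive(X))  # e.g. [[[0], [1]], [[2], [3]]] -> ((A, B), (C, D))
```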
Comparison: Bottom-Up vs Top-Down
| Feature | Bottom-Up (Agglomerative) | Top-Down (Divisive) |
|-----------------------|--------------------------------|-------------------------------|
| Initial Step | Start with individual points | Start with all points |
| Approach | Merge clusters | Split clusters |
| Complexity | O(n² log n) | Higher, often exponential |
| Common Usage | More commonly used | Less common due to complexity |
| Output | Dendrogram | Dendrogram |
Visualization: Dendrogram
      ___________
     |           |
  ___|___     ___|___
 |       |   |       |
 A       B   C       D
- The height of the lines represents the distance (or dissimilarity) at which merges/splits occur.
- You can cut the dendrogram at a certain height to decide the number of clusters.
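Programmatically, cutting the tree at a chosen height corresponds to SciPy's `fcluster` with the `distance` criterion (a sketch reusing the linkage matrix from the A-D example above):

```python
# Cut the dendrogram at a chosen height to obtain flat cluster labels.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Condensed distances in the order A-B, A-C, A-D, B-C, B-D, C-D
distances = np.array([2.0, 6.0, 10.0, 5.0, 9.0, 4.0])
Z = linkage(distances, method="single")

# Cut below the final merge (height 5) but above the first two (2 and 4)
labels = fcluster(Z, t=4.5, criterion="distance")
print(labels)  # e.g. [1 1 2 2]: A and B together, C and D together
```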
Summary
- Hierarchical clustering builds nested clusters either bottom-up or top-down.
- No need to pre-define the number of clusters.
- Useful when the hierarchical relationships among data points are important.
- Visualized using dendrograms.
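As a closing sketch (assuming scikit-learn is installed; the sample points are the illustrative ones used above), the fact that the number of clusters need not be fixed in advance can be seen with `AgglomerativeClustering` and a distance threshold:

```python
# Agglomerative clustering without specifying the number of clusters:
# pass a distance threshold and let the hierarchy decide.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 4.0], [6.0, 5.0]])  # A, B, C, D

model = AgglomerativeClustering(
    n_clusters=None,         # do not fix the number of clusters
    distance_threshold=3.0,  # stop merging above this linkage distance
    linkage="single",
)
labels = model.fit_predict(X)
print(labels)                # e.g. [0 0 1 1]: two clusters found automatically
```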