CSE464 & CSE468 : Image
Processing and Pattern
Recognition
Clustering
by:
Hossam El Din Hassan Abd El Munim
حسام الدين حسن عبد المنعم
Computer & Systems Engineering Dept.,
Ain Shams University,
1 El-Sarayat Street, Abbassia, Cairo 11517
Today
▪ New Topic: Unsupervised Learning
▪ Supervised vs. unsupervised learning
▪ Unsupervised learning
▪ Nonparametric unsupervised learning = clustering
▪ Proximity Measures
▪ Criterion Functions
▪ Flat Clustering
▪ k-means
▪ Hierarchical Clustering
▪ Divisive
▪ Agglomerative
Supervised vs. Unsupervised Learning
▪ Up to now we considered the supervised learning scenario, where we are given
1. samples x1,…, xn
2. class labels for all samples x1,…, xn
▪ This is also called learning with a teacher, since the correct answer (the true class) is provided
▪ In the next few lectures we consider the unsupervised learning scenario, where we are given only
1. samples x1,…, xn
▪ This is also called learning without a teacher, since the correct answer is not provided
▪ We do not split the data into training and test sets
Unsupervised Learning
▪ Data is not labeled
1. Parametric approach (a lot is known: "easier")
▪ assume a parametric distribution of the data
▪ estimate the parameters of this distribution
▪ still much "harder" than the supervised case
2. Nonparametric approach (little is known: "harder")
▪ group the data into clusters; each cluster (hopefully) says something about the categories (classes) present in the data
Why Unsupervised Learning?
▪ Unsupervised learning is harder
▪ How do we know if the results are meaningful? No correct answer (labels) is available.
▪ Let the expert look at the results (external evaluation)
▪ Define an objective function on clustering (internal evaluation)
▪ We nevertheless need it because
1. Labeling large datasets is very costly (speech recognition)
▪ sometimes can label only a few examples by hand
2. May have no idea what/how many classes there are (data
mining)
3. May want to use clustering to gain some insight into the
structure of the data before designing a classifier
▪ Clustering as data description
Clustering
▪ Seek “natural” clusters in the data
▪ What is a good clustering?
▪ internal (within the cluster) distances should be small
▪ external (between-cluster) distances should be large
▪ Clustering is a way to discover new
categories (classes)
What we Need for Clustering
1. Proximity measure, either
▪ similarity measure s(xi,xk): large if xi,xk are similar
▪ dissimilarity(or distance) measure d(xi,xk): small if xi,xk are similar
2. Criterion function to evaluate a clustering
3. Algorithm to compute clustering
▪ For example, by optimizing the criterion function
How Many Clusters?
3 clusters or 2 clusters?
▪ Possible approaches
1. fix the number of clusters to k
2. find the best clustering according to the criterion
function (number of clusters may vary)
Proximity Measures
▪ A good proximity measure is VERY application dependent
▪ Clusters should be invariant under the transformations "natural" to the problem
▪ For example, for object recognition the measure should be invariant to rotation (a rotated object should be at distance 0 from the original)
▪ For character recognition, there should be no invariance to rotation (a rotated '9' would become a '6')
Distance (dissimilarity) Measures
▪ Euclidean distance
$d(\mathbf{x}_i,\mathbf{x}_j)=\left(\sum_{k=1}^{d}(x_{ik}-x_{jk})^2\right)^{1/2}$
▪ translation invariant
▪ Manhattan (city block) distance
$d(\mathbf{x}_i,\mathbf{x}_j)=\sum_{k=1}^{d}|x_{ik}-x_{jk}|$
▪ approximation to Euclidean distance,
cheaper to compute
▪ Chebyshev distance
$d(\mathbf{x}_i,\mathbf{x}_j)=\max_{1\le k\le d}|x_{ik}-x_{jk}|$
▪ approximation to Euclidean distance,
cheapest to compute
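As a quick illustration (not part of the slides), a minimal NumPy sketch of the three distance measures; the function names are our own:

```python
import numpy as np

def euclidean(xi, xj):
    # square root of the sum of squared coordinate differences
    return np.sqrt(np.sum((xi - xj) ** 2))

def manhattan(xi, xj):
    # sum of absolute coordinate differences (city block)
    return np.sum(np.abs(xi - xj))

def chebyshev(xi, xj):
    # largest absolute coordinate difference
    return np.max(np.abs(xi - xj))

xi, xj = np.array([1.0, 2.0]), np.array([4.0, 6.0])
print(euclidean(xi, xj), manhattan(xi, xj), chebyshev(xi, xj))  # 5.0 7.0 4.0
```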
Similarity Measures
▪ Cosine similarity:
$s(\mathbf{x}_i,\mathbf{x}_j)=\dfrac{\mathbf{x}_i^T\mathbf{x}_j}{\|\mathbf{x}_i\|\,\|\mathbf{x}_j\|}$
▪ the smaller the angle, the larger the
similarity
▪ scale invariant measure
▪ popular in text retrieval
▪ Correlation coefficient
▪ popular in image processing
$s(\mathbf{x}_i,\mathbf{x}_j)=\dfrac{\sum_{k=1}^{d}(x_{ik}-\bar{x}_i)(x_{jk}-\bar{x}_j)}{\left[\sum_{k=1}^{d}(x_{ik}-\bar{x}_i)^2\;\sum_{k=1}^{d}(x_{jk}-\bar{x}_j)^2\right]^{1/2}}$
where $\bar{x}_i$ and $\bar{x}_j$ are the means of the components of $\mathbf{x}_i$ and $\mathbf{x}_j$
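A minimal NumPy sketch of both similarity measures (illustrative, not from the slides; function names are our own):

```python
import numpy as np

def cosine_similarity(xi, xj):
    # dot product normalized by the vector lengths; scale invariant
    return xi @ xj / (np.linalg.norm(xi) * np.linalg.norm(xj))

def correlation_coefficient(xi, xj):
    # center each vector by its own mean, then take the cosine of the centered vectors
    ci, cj = xi - xi.mean(), xj - xj.mean()
    return ci @ cj / np.sqrt(np.sum(ci ** 2) * np.sum(cj ** 2))
```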
Feature Scale
Simplest Clustering Algorithm
• Two issues
1. How to measure similarity between samples?
2. How to evaluate a partitioning?
• If distance is a good measure of dissimilarity, the distance between samples in the same cluster must be smaller than the distance between samples in different clusters
• Two samples belong to the same cluster if the distance between them is less than a threshold d0 (a sketch of this rule follows below)
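As an illustration of this rule (our own sketch; the function name and the connected-components interpretation are assumptions): link any two samples closer than d0, and let each linked group be one cluster.

```python
import numpy as np

def threshold_clustering(X, d0):
    """Group samples whose pairwise distance is below d0 (single-linkage style)."""
    n = len(X)
    labels = -np.ones(n, dtype=int)   # -1 means "not yet assigned"
    cluster = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        # start a new cluster at sample i and grow it by adding close samples
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            dists = np.linalg.norm(X - X[p], axis=1)
            for q in np.where((dists < d0) & (labels < 0))[0]:
                labels[q] = cluster
                stack.append(q)
        cluster += 1
    return labels
```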
Scaling Axis
Criterion Functions for Clustering
▪ Have samples x1,…,xn
▪ Suppose partitioned samples into c subsets D1,…,Dc
▪ There are approximately $c^n/c!$ distinct partitions
▪ Can define a criterion function J(D1,…,Dc) which
measures the quality of a partitioning D1,…,Dc
▪ Then the clustering problem becomes a well-defined problem
▪ the optimal clustering is the partition which optimizes the
criterion function
SSE Criterion Function
▪ Let $n_i$ be the number of samples in $D_i$, and define the mean of the samples in $D_i$ as
$\boldsymbol{\mu}_i = \frac{1}{n_i}\sum_{\mathbf{x}\in D_i}\mathbf{x}$
▪ Then the sum-of-squared-errors criterion function (to minimize) is:
$J_{SSE} = \sum_{i=1}^{c}\sum_{\mathbf{x}\in D_i}\|\mathbf{x}-\boldsymbol{\mu}_i\|^2$
▪ Note that the number of clusters, c, is fixed
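A small illustrative helper (our own, not from the slides) that evaluates $J_{SSE}$ for a given partition, with the clusters described by a label vector:

```python
import numpy as np

def sse(X, labels):
    """Sum of squared distances of each sample to its cluster mean (J_SSE)."""
    total = 0.0
    for c in np.unique(labels):
        cluster = X[labels == c]
        mu = cluster.mean(axis=0)              # cluster mean
        total += np.sum((cluster - mu) ** 2)   # squared distances to the mean
    return total
```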
SSE Criterion Function
$J_{SSE} = \sum_{i=1}^{c}\sum_{\mathbf{x}\in D_i}\|\mathbf{x}-\boldsymbol{\mu}_i\|^2$
▪ SSE criterion appropriate when data forms compact
clouds that are relatively well separated
▪ SSE criterion favors equally sized clusters, and may
not be appropriate when “natural” groupings have
very different sizes
K-means Clustering
▪ We now consider an example of an iterative optimization algorithm for the special case of the $J_{SSE}$ objective function
$J_{SSE} = \sum_{i=1}^{k}\sum_{\mathbf{x}\in D_i}\|\mathbf{x}-\boldsymbol{\mu}_i\|^2$
▪ for a different objective function, we need a different optimization algorithm, of course
▪ Fix number of clusters to k (c = k)
▪ k-means is probably the most famous clustering algorithm
▪ it has a smart way of moving from current partitioning to the next one
K-means Clustering (example with k = 3)
1. Initialize
▪ pick k cluster centers arbitrarily
▪ assign each example to the closest center
2. Compute the sample mean of each cluster
3. Reassign all samples to the closest mean
4. If the cluster assignments changed in step 3, go to step 2
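For concreteness, a compact NumPy sketch of these four steps (our own illustration; initializing the centers with k random samples is an assumption, not something the slide prescribes):

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # step 1: pick k cluster centers arbitrarily (here: k distinct random samples)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # steps 1/3: assign every sample to the closest center (mean)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # step 4: stop when no assignment changes
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # step 2: recompute the sample mean of each cluster
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# usage: three Gaussian blobs in 2-D
X = np.vstack([np.random.randn(50, 2) + c for c in ([0, 0], [5, 5], [0, 5])])
labels, centers = kmeans(X, k=3)
```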
K-means Clustering (Means Derivation)
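A brief sketch of the derivation (standard calculus, not reproduced verbatim from the slide): for a fixed assignment of samples to clusters, setting the gradient of $J_{SSE}$ with respect to each center to zero shows that the optimal center is the cluster mean,

$\dfrac{\partial}{\partial\boldsymbol{\mu}_i}\sum_{\mathbf{x}\in D_i}\|\mathbf{x}-\boldsymbol{\mu}_i\|^2 \;=\; -2\sum_{\mathbf{x}\in D_i}(\mathbf{x}-\boldsymbol{\mu}_i) \;=\; 0 \;\;\Longrightarrow\;\; \boldsymbol{\mu}_i=\dfrac{1}{n_i}\sum_{\mathbf{x}\in D_i}\mathbf{x}$

which is exactly the sample-mean update used in step 2 of the algorithm.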
K-means Clustering
▪ In practice, k-means clustering usually performs well
▪ It is very efficient
▪ Its solution can be used as a starting point for other clustering algorithms
▪ Hundreds of papers on variants and improvements of k-means clustering are still published every year
Clustering Criteria Again
• Scatter criteria
• Scatter matrices used in multiple discriminant analysis,
i.e., the within-scatter matrix SW and the between-
scatter matrix SB
$S_T = S_B + S_W$
• Note:
• ST does not depend on partitioning
• In contrast, SB and SW depend on partitioning
• Two approaches:
• minimize the within-cluster scatter
• maximize the between-cluster scatter
Pattern Classification, Chapter 10
The trace (sum of diagonal elements) is the
simplest scalar measure of the scatter matrix
$\operatorname{tr}[S_W] = \sum_{i=1}^{c}\operatorname{tr}[S_i] = \sum_{i=1}^{c}\sum_{\mathbf{x}\in D_i}\|\mathbf{x}-\mathbf{m}_i\|^2 = J_e$
• proportional to the sum of the variances in the
coordinate directions
• This is the sum-of-squared-error criterion, Jsse.
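To make the connection concrete, a small NumPy sketch (our own) that computes $S_W$ and $S_B$ for a labeled partition; one can check numerically that $\operatorname{tr}[S_W]$ equals $J_e$ and that $S_W + S_B$ equals the total scatter $S_T$:

```python
import numpy as np

def scatter_matrices(X, labels):
    m = X.mean(axis=0)                      # total mean vector
    d = X.shape[1]
    SW = np.zeros((d, d))
    SB = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mc = Xc.mean(axis=0)                # cluster mean
        SW += (Xc - mc).T @ (Xc - mc)       # within-cluster scatter
        SB += len(Xc) * np.outer(mc - m, mc - m)  # between-cluster scatter
    return SW, SB
```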
• As tr[ST] = tr[SW] + tr[SB] and tr[ST] is independent of the partitioning, tr[SB] yields no new criterion: minimizing tr[SW] is the same as maximizing tr[SB]
• In particular, minimizing the within-cluster criterion Jsse = tr[SW] is equivalent to maximizing the between-cluster criterion
$\operatorname{tr}[S_B] = \sum_{i=1}^{c} n_i\,\|\mathbf{m}_i - \mathbf{m}\|^2$
where m is the total mean vector:
$\mathbf{m} = \frac{1}{n}\sum_{\mathbf{x}\in D}\mathbf{x} = \frac{1}{n}\sum_{i=1}^{c} n_i\,\mathbf{m}_i$
Iterative optimization
• Clustering is a discrete optimization problem
• A finite data set has only a finite number of partitions
• What is the cost of exhaustive search? Approximately $c^n/c!$ partitions for c clusters (for c = 5 and n = 100 this is already more than $10^{67}$). Not a good idea
• Typically iterative optimization is used:
• start from a reasonable initial partition
• redistribute samples so as to minimize the criterion function
• this guarantees local, not global, optimization
• consider an iterative procedure to minimize the
sum-of-squared-error criterion Je
$J_e = \sum_{i=1}^{c} J_i, \qquad \text{where } J_i = \sum_{\mathbf{x}\in D_i}\|\mathbf{x}-\mathbf{m}_i\|^2$
is the effective error per cluster.
• Moving a sample $\hat{\mathbf{x}}$ from cluster Di to Dj changes the errors in the two clusters by:
$J_j^* = J_j + \frac{n_j}{n_j+1}\,\|\hat{\mathbf{x}}-\mathbf{m}_j\|^2, \qquad J_i^* = J_i - \frac{n_i}{n_i-1}\,\|\hat{\mathbf{x}}-\mathbf{m}_i\|^2$
• Hence, the transfer is advantageous if the decrease in Ji is larger than the increase in Jj, i.e., if
$\frac{n_i}{n_i-1}\,\|\hat{\mathbf{x}}-\mathbf{m}_i\|^2 > \frac{n_j}{n_j+1}\,\|\hat{\mathbf{x}}-\mathbf{m}_j\|^2$
• Alg. 3 is a sequential version of the k-means algorithm
• Alg. 3 updates the means each time a sample is reclassified
• k-means waits until all n samples have been reclassified before updating
• Alg. 3 can get trapped in local minima
• The result depends on the order of the samples
• However, it is at least a stepwise optimal procedure, and it can easily be modified for problems in which samples are acquired sequentially and clustering must be done on-line
• Alg. 2 is fuzzy k-means clustering
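The slides reference Algorithms 2 and 3 from the textbook without listing them here; as an illustration only, a hedged sketch of a sequential (online) pass in the spirit described above, using the transfer test from the previous slide (all names and the in-place update scheme are our own choices, not the textbook's exact Algorithm 3):

```python
import numpy as np

def sequential_kmeans_pass(X, labels, means, counts):
    """One pass over the data, moving each sample if the transfer lowers Je.

    labels (int array), means (float array, shape k x d) and counts (int array)
    describe the current partition and are updated in place.
    """
    for idx, x in enumerate(X):
        i = labels[idx]
        if counts[i] <= 1:
            continue  # never empty a cluster
        # decrease in Ji if x leaves cluster i
        gain = counts[i] / (counts[i] - 1) * np.sum((x - means[i]) ** 2)
        # increase in Jj for every other candidate cluster j
        costs = np.array([counts[j] / (counts[j] + 1) * np.sum((x - means[j]) ** 2)
                          for j in range(len(means))])
        costs[i] = np.inf
        j = costs.argmin()
        if costs[j] < gain:
            # move x from cluster i to cluster j, updating the means incrementally
            means[i] = (counts[i] * means[i] - x) / (counts[i] - 1)
            means[j] = (counts[j] * means[j] + x) / (counts[j] + 1)
            counts[i] -= 1
            counts[j] += 1
            labels[idx] = j
```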
Selecting the Number of
Clusters (Elbow Method)
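The slide itself shows only a plot of the criterion versus k; as a hedged sketch of the usual procedure, run k-means for a range of k, plot $J_{SSE}(k)$, and choose k at the bend ("elbow") of the curve. The example below reuses the kmeans and sse helpers defined earlier (our own names):

```python
import matplotlib.pyplot as plt

# X: data matrix of shape (n, d), e.g., the toy blobs from the k-means sketch above
ks = range(1, 11)
errors = []
for k in ks:
    labels, centers = kmeans(X, k)
    errors.append(sse(X, labels))

plt.plot(list(ks), errors, marker='o')
plt.xlabel('number of clusters k')
plt.ylabel('J_SSE')
plt.show()  # choose k at the "elbow", where adding clusters stops paying off
```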
Hierarchical Clustering
• Many times, clusters are not disjoint, but a cluster
may have subclusters, in turn having sub-
subclusters, etc.
• Consider a sequence of partitions of the n samples into c clusters
• The first is a partition into n clusters, each one containing exactly one sample
• The second is a partition into n-1 clusters, the third into n-2, and so on, until the n-th, in which there is only one cluster containing all of the samples
• At level k in the sequence, c = n-k+1
• Given any two samples x and x’, they will be grouped
together at some level, and if they are grouped a level k,
they remain grouped for all higher levels
• Hierarchical clustering tree representation called
dendrogram
• Are groupings natural or forced? Check the similarity values
• If the similarity values are evenly distributed across levels, there is no justification for grouping
• Another representation is based on sets, e.g., Venn diagrams
• Hierarchical clustering can be divided into agglomerative and divisive approaches
• Agglomerative (bottom-up, clumping): start with n singleton clusters and form the sequence by merging clusters
• Divisive (top-down, splitting): start with all of the samples in one cluster and form the sequence by successively splitting clusters
Agglomerative hierarchical clustering
• The procedure terminates when the specified number of clusters has been obtained, and returns the clusters as sets of points, rather than the mean or a representative vector of each cluster
• At any level, the distance between nearest clusters
can provide the dissimilarity value for that level
• To find the nearest clusters, one can use
$d_{\min}(D_i,D_j) = \min_{\mathbf{x}\in D_i,\,\mathbf{x}'\in D_j}\|\mathbf{x}-\mathbf{x}'\|$
$d_{\max}(D_i,D_j) = \max_{\mathbf{x}\in D_i,\,\mathbf{x}'\in D_j}\|\mathbf{x}-\mathbf{x}'\|$
$d_{avg}(D_i,D_j) = \frac{1}{n_i n_j}\sum_{\mathbf{x}\in D_i}\sum_{\mathbf{x}'\in D_j}\|\mathbf{x}-\mathbf{x}'\|$
$d_{mean}(D_i,D_j) = \|\mathbf{m}_i-\mathbf{m}_j\|$
which behave quite similarly if the clusters are hyperspherical and well separated.
• The computational complexity is $O(cn^2d^2)$, with $n \gg c$
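A minimal NumPy sketch of the four inter-cluster distances (illustrative; the function names are our own):

```python
import numpy as np

def pairwise(Di, Dj):
    # all pairwise Euclidean distances between the two clusters
    return np.linalg.norm(Di[:, None, :] - Dj[None, :, :], axis=2)

def d_min(Di, Dj):  return pairwise(Di, Dj).min()    # single linkage
def d_max(Di, Dj):  return pairwise(Di, Dj).max()    # complete linkage
def d_avg(Di, Dj):  return pairwise(Di, Dj).mean()   # average linkage
def d_mean(Di, Dj): return np.linalg.norm(Di.mean(axis=0) - Dj.mean(axis=0))
```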
• Nearest-neighbor algorithm (single linkage)
• dmin is used as the distance measure in the agglomerative procedure
• The farthest-neighbor algorithm (complete linkage)
• dmax is used as the distance measure in the agglomerative procedure
• When two clusters are merged, the graph is changed by adding edges between every pair of nodes in the two clusters
• All procedures involving minima or maxima are sensitive to outliers; the use of dmean or davg is a natural compromise
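As a practical pointer (not from the slides), SciPy provides these linkage strategies directly; a minimal usage sketch, assuming a sample matrix X of shape (n, d):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

X = np.random.randn(30, 2)                        # toy data: n = 30 samples, d = 2
Z = linkage(X, method='single')                   # 'complete', 'average', 'centroid' also available
labels = fcluster(Z, t=3, criterion='maxclust')   # cut the tree into 3 clusters
dendrogram(Z)                                     # tree representation of the hierarchy
plt.show()
```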
Applications of Clustering
▪ Image segmentation
▪ Find interesting "objects" in images to focus attention on
From: Image Segmentation by Nested Cuts, O. Veksler, CVPR2000
Applications of Clustering
▪ Image Database Organization
▪ for efficient search
Applications of Clustering
▪ Data Mining
▪ Technology watch
▪ The Derwent Database contains all patents filed worldwide in the last 10 years
▪ Searching by keywords leads to thousands of
documents
▪ Find clusters in the database to see if there are any emerging technologies and what the competition is up to
▪ Marketing
▪ Customer database
▪ Find clusters of customers and tailor marketing
schemes to them
Applications of Clustering
▪ Profiling Web Users
▪ Use web access logs to generate a feature vector for
each user
▪ Cluster users based on their feature vectors
▪ Identify common goals for users
▪ Shopping
▪ Job Seekers
▪ Product Seekers
▪ Tutorials Seekers
▪ Can use clustering results to improve web content and design
Summary
▪ Clustering (nonparametric unsupervised learning)
is useful for discovering inherent structure in data
▪ Clustering is immensely useful in different fields
▪ Clustering comes naturally to humans (in up to 3
dimensions), but not so to computers
▪ It is very easy to design a clustering algorithm, but
it is very hard to say if it does anything good
▪ General-purpose clustering does not exist; for best results, clustering should be tuned to the application at hand