Spectral Clustering
emilio.parrado@uc3m.es
March 1, 2022
Index
2 KMeans limitations
3 Spectral Clustering
   Kernels as similarities
   Graph partitioning
   K-Prototypes spectral clustering
KMeans Algorithm
Input:
- K: number of clusters
- X: data matrix, each observation in a row
Output: K centroids, m1, m2, ..., mK
1 Initialise the centroids at random
2 Assign observations to clusters
3 Recompute centroids
    mk = (1/Nk) Σ_{xj ∈ Ck} xj ,   k = 1, ..., K
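The three steps above can be sketched in NumPy; this is a minimal, illustrative implementation (it uses a deterministic farthest-point initialisation for reproducibility instead of the purely random one on the slide):

```python
import numpy as np

def kmeans(X, K, n_iter=100):
    """Lloyd's algorithm: alternate cluster assignment and centroid update."""
    # 1) Initialise centroids: first is x_0, the rest by farthest-point
    idx = [0]
    for _ in range(1, K):
        d2 = ((X[:, None] - X[idx][None]) ** 2).sum(-1).min(1)
        idx.append(int(d2.argmax()))
    m = X[idx].astype(float)
    for _ in range(n_iter):
        # 2) Assign each observation to its nearest centroid
        labels = ((X[:, None] - m[None]) ** 2).sum(-1).argmin(1)
        # 3) Recompute each centroid as the mean of its cluster
        for k in range(K):
            if (labels == k).any():
                m[k] = X[labels == k].mean(0)
    return m, labels
```

For two well-separated blobs this recovers one centroid per blob in a handful of iterations.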
K =
    1 1 1 1 1 0 0 0 0 0 0 0
    1 1 1 1 1 0 0 0 0 0 0 0
    1 1 1 1 1 0 0 0 0 0 0 0
    1 1 1 1 1 0 0 0 0 0 0 0
    1 1 1 1 1 0 0 0 0 0 0 0
    0 0 0 0 0 1 1 1 1 0 0 0
    0 0 0 0 0 1 1 1 1 0 0 0
    0 0 0 0 0 1 1 1 1 0 0 0
    0 0 0 0 0 1 1 1 1 0 0 0
    0 0 0 0 0 0 0 0 0 1 1 1
    0 0 0 0 0 0 0 0 0 1 1 1
    0 0 0 0 0 0 0 0 0 1 1 1

(a block-diagonal affinity matrix: three fully connected groups of sizes 5, 4 and 3)
The three largest eigenvalues of K are 5, 4 and 3 (the group sizes), and the corresponding eigenvectors are scaled group indicators:

V =
    −0.45    0       0
    −0.45    0       0
    −0.45    0       0
    −0.45    0       0
    −0.45    0       0
      0    −0.5      0
      0    −0.5      0
      0    −0.5      0
      0    −0.5      0
      0      0    −0.58
      0      0    −0.58
      0      0    −0.58

so that K = V diag(5, 4, 3) V^T: each column is a (sign-flipped) normalised indicator of one group, with entries −1/√5, −1/√4 and −1/√3 respectively.
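This structure can be checked numerically: the leading eigenvectors of a block-diagonal matrix of ones are (up to sign) scaled cluster indicators, with eigenvalues equal to the block sizes. A NumPy sketch:

```python
import numpy as np

# Block-diagonal affinity: three fully connected groups of sizes 5, 4 and 3
sizes = [5, 4, 3]
K = np.zeros((12, 12))
start = 0
for s in sizes:
    K[start:start + s, start:start + s] = 1.0
    start += s

vals, vecs = np.linalg.eigh(K)        # eigh returns ascending eigenvalues
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]
# Top eigenvalues are the block sizes; each top eigenvector is
# +-1/sqrt(size) on its block and 0 elsewhere.
print(np.round(vals[:3], 2))          # prints [5. 4. 3.]
```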
The affinity matrix and the training data samples define a graph in which:
- Each training sample becomes a node in the graph
- A positive value in position A(i, j) of the affinity matrix is the weight of the edge connecting the nodes corresponding to xi and xj
Therefore, finding clusters in the data can be addressed as cutting the graph into subgraphs by removing the weak edges that connect disjoint subsets of heavily connected nodes.
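A common way to build such an affinity matrix from raw data is a Gaussian (RBF) kernel: nearby points get edge weights close to 1, distant points close to 0. A minimal sketch (the bandwidth `gamma` is an illustrative choice, not prescribed by the slides):

```python
import numpy as np

def rbf_affinity(X, gamma=1.0):
    """A[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)
```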
Normalised-Cut graph partitioning
Given two sets of nodes S1 and S2, define W(S1, S2) as the weight of the edges connecting S1 and S2:

    W(S1, S2) = Σ_{xi ∈ S1} Σ_{xj ∈ S2} Ai,j

For a graph with a set of nodes V, we define the Normalised Cut of a partition of V into K disjoint clusters V = ∪_{k=1}^K Vk as

    NCut = Σ_{k=1}^K W(Vk, V \ Vk) / W(Vk, V) = Σ_{k=1}^K ek^T (D − A) ek / (ek^T D ek)

where ek is the 0/1 indicator vector of cluster Vk (its i-th entry is 1 if xi ∈ Vk) and D is the diagonal degree matrix with Di,i = Σ_j Ai,j.
The denominator:

    W(Vk, V) = ek^T D ek

Each element Di,i in the diagonal of D is the weight of all the edges connecting xi with the other observations, so ek^T D ek sums all the Di,i that correspond to elements in the same cluster.

The numerator:

    W(Vk, V \ Vk) = ek^T (D − A) ek

accounts for the weight of the connections of the elements in each cluster with the elements not in the cluster (the edges we would cut if we applied that partition). ek^T D ek contains the weight of all the edges connecting the elements of the cluster with the whole dataset; ek^T A ek contains the weight of the edges among members of the cluster, that is, Ai,j enters the sum as long as (ek)i = 1 and (ek)j = 1, i.e. xi and xj are both members of Ck.
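These identities can be verified directly on a small affinity matrix using 0/1 indicator vectors (the toy matrix below is an illustrative example, not from the slides):

```python
import numpy as np

# Toy affinity: two well-connected pairs joined by one weak edge (weight 0.1)
A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
D = np.diag(A.sum(axis=1))

e1 = np.array([1.0, 1.0, 0.0, 0.0])  # indicator of cluster {x1, x2}
within = e1 @ A @ e1                 # weight of edges inside the cluster
total  = e1 @ D @ e1                 # W(V1, V): cluster-to-everything weight
cut    = e1 @ (D - A) @ e1           # W(V1, V \ V1): edges we would cut
```

Here `cut` equals the weight of the single weak edge (0.1), and `total = within + cut`, exactly as the decomposition above says.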
Minimising the normalised cut therefore amounts to

    min Σ_{k=1}^K ek^T (D − A) ek / (ek^T D ek)

and, rescaling with D^{-1/2},

    min ek^T D^{-1/2} (D − A) D^{-1/2} ek = min ek^T (I − D^{-1/2} A D^{-1/2}) ek
Spectral Clustering Algorithm
Input:
- K: number of clusters
- A: affinity matrix
Output: K clusters, each cluster being a list of input observations C1, C2, ..., CK
1 Zero the elements in the diagonal of A
2 Construct a diagonal matrix D where the element Di,i = Σ_{j=1}^n Ai,j
3 Construct the Laplacian of the graph L = D^{-1/2} A D^{-1/2}
4 Find the K eigenvectors of L with largest eigenvalues, z1, z2, ..., zK
5 Form matrix Z stacking the eigenvectors in columns: Zn×K = [z1, ..., zK]
6 Form matrix Y normalising Z row-wise: Yi,j = Zi,j / √(Σj Zi,j²)
7 Perform a KMeans clustering on the rows of Y and assign instance xj to cluster Ck if row j of Y is assigned to cluster k
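The seven steps above can be sketched compactly in NumPy. This is an illustrative implementation: the KMeans of step 7 is inlined as a small Lloyd loop with farthest-point initialisation rather than random initialisation.

```python
import numpy as np

def spectral_clustering(A, K, n_iter=50):
    """Normalised spectral clustering on an affinity matrix A (steps 1-7)."""
    A = A.astype(float).copy()
    np.fill_diagonal(A, 0.0)                          # 1) zero the diagonal
    d = A.sum(axis=1)                                 # 2) degrees D_ii
    Dm12 = np.diag(1.0 / np.sqrt(d))
    L = Dm12 @ A @ Dm12                               # 3) L = D^{-1/2} A D^{-1/2}
    vals, vecs = np.linalg.eigh(L)
    Z = vecs[:, np.argsort(vals)[::-1][:K]]           # 4-5) top-K eigenvectors
    Y = Z / np.linalg.norm(Z, axis=1, keepdims=True)  # 6) row-normalise
    # 7) KMeans on the rows of Y (farthest-point init, then Lloyd updates)
    idx = [0]
    for _ in range(1, K):
        d2 = ((Y[:, None] - Y[idx][None]) ** 2).sum(-1).min(1)
        idx.append(int(d2.argmax()))
    m = Y[idx].copy()
    for _ in range(n_iter):
        labels = ((Y[:, None] - m[None]) ** 2).sum(-1).argmin(1)
        for k in range(K):
            if (labels == k).any():
                m[k] = Y[labels == k].mean(0)
    return labels
```

On a block-structured affinity matrix like the one shown earlier, the rows of Y collapse to one point per cluster, so the final KMeans separates them trivially.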
P =
    0.   0.53 0.   0.47 0.   0.   0.
    0.5  0.   0.   0.5  0.   0.   0.
    0.   0.   0.   1.   0.   0.   0.
    0.29 0.33 0.37 0.   0.02 0.   0.
    0.   0.   0.   0.03 0.   0.42 0.55
    0.   0.   0.   0.   0.47 0.   0.53
    0.   0.   0.   0.   0.53 0.47 0.
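Assuming P is the row-stochastic random-walk matrix obtained by normalising the rows of the affinity matrix (P = D^{-1} A, which matches the rows above summing to 1), the transition matrix after M iterations in the graph is just the matrix power P^M. A sketch:

```python
import numpy as np

def transition_matrix(A, M=1):
    """Random-walk transition matrix P = D^{-1} A, raised to the M-th power."""
    P = A / A.sum(axis=1, keepdims=True)  # normalise each row to sum to 1
    return np.linalg.matrix_power(P, M)
```

Row-stochasticity is preserved under powers, so every row of P^M is again a probability distribution over the nodes.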
KL divergence
If Qk is a prototype (centroid) for the K-prototypes algorithm (K-means run on the rows of P^M), then the distance from a row of P^M to the prototype can be computed as:

    KL(Pi^M || Qk) = Σ_{j=1}^n Pij^M ln(Pij^M / Qk,j)
Input:
- K: number of clusters
- P^M: transition matrix after M iterations in the graph
Output:
- K clusters, each cluster being a list of input observations C1, C2, ..., CK
- K prototypes Q1, Q2, ..., QK (centroids)
1 Initialise the prototypes (for instance, choose K rows of P^M at random)
2 Assign observations to clusters: observation xi goes to cluster Ck if k = argmin_l KL(Pi^M || Ql)
3 Recompute prototypes
If the eigenvectors v1, v2, ..., vn of P are orthonormal,

    [v1 v2 ... vn]^T [v1 v2 ... vn] = I

then

    P^M = [v1 v2 ... vn] diag(λ1^M, λ2^M, ..., λn^M) [v1 v2 ... vn]^T
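Assuming P is symmetric (so that an orthonormal eigenbasis exists), this identity can be checked numerically on a small example:

```python
import numpy as np

# A symmetric, row-stochastic transition matrix on 3 nodes
P = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
lam, V = np.linalg.eigh(P)                    # P = V diag(lam) V^T
assert np.allclose(V.T @ V, np.eye(3))        # orthonormal eigenvectors
M = 4
PM = V @ np.diag(lam ** M) @ V.T              # P^M via the eigendecomposition
assert np.allclose(PM, np.linalg.matrix_power(P, M))
```

Raising the eigenvalues to the M-th power is far cheaper than M matrix multiplications, which is the point of writing P^M this way.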
(Figure: log(|λK+1|) as a function of M, with M even.)