Graph Clustering and cluster
Composed by Adil
Cluster Analysis
The process of dividing a set of input data into possibly
overlapping subsets, where the elements in each subset are
considered related by some similarity measure.
Similarity
A cluster is a set of entities that are alike; entities from
different clusters are not alike.
What is a cluster?
Clustering and cluster are not synonymous. A clustering
is an entire collection of clusters; a cluster, on the other
hand, is just one part of that collection. There are
different types of clusters and also different types of
clusterings.
Types of clusters:
Well-separated clusters
Center-based clusters
Contiguous clusters
Density-based clusters
Shared-property or conceptual clusters
Well-separated cluster:
A cluster is a set of points such that any point in a cluster
is closer (or more similar) to every other point in the
cluster than to any point not in the cluster.
(Figure: 3 well-separated clusters.)
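The well-separated definition can be checked directly. Below is a minimal sketch, assuming Euclidean distance as the similarity measure; the function name `is_well_separated` and the sample points are illustrative and not from the slides.

```python
from math import dist

def is_well_separated(points, labels):
    """Return True if every point is closer to all points of its own
    cluster than to any point of another cluster."""
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if j != i and labels[j] == labels[i]]
        other = [dist(p, q) for j, q in enumerate(points)
                 if labels[j] != labels[i]]
        # The farthest same-cluster point must still be nearer than
        # the nearest point of any other cluster.
        if same and other and max(same) >= min(other):
            return False
    return True

# Hypothetical sample data: two tight groups far apart.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]
bad_labels = [0, 0, 1, 1, 1, 1]   # (1, 0) is put in the wrong group

print(is_well_separated(points, good_labels))  # True
print(is_well_separated(points, bad_labels))   # False
```

The check compares all pairs of points, so it is only practical for small data sets.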
Center-based cluster:
A cluster is a set of objects such that each object in the
cluster is closer (more similar) to the “center” of its own
cluster than to the center of any other cluster.
The center of a cluster is often a centroid, the average of
all the points in the cluster, or a medoid, the most
“representative” point of the cluster.
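The difference between a centroid and a medoid is easiest to see on a small example. This is a sketch assuming Euclidean distance, and assuming that "most representative" means the point with the smallest total distance to the other points; the helper names are illustrative.

```python
from math import dist

def centroid(points):
    """Coordinate-wise average; usually not one of the data points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def medoid(points):
    """The actual data point with the smallest total distance to all others."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

def nearest_center(point, centers):
    """Index of the closest center (the center-based assignment rule)."""
    return min(range(len(centers)), key=lambda k: dist(point, centers[k]))

# Hypothetical cluster with one far-away point.
cluster = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (20.0, 20.0)]

print(centroid(cluster))  # (6.0, 6.0), pulled toward the outlier
print(medoid(cluster))    # (3.0, 3.0), always a member of the cluster
print(nearest_center((2.0, 2.0), [(0.0, 0.0), (10.0, 10.0)]))  # 0
```

The outlier drags the centroid away from the bulk of the points, while the medoid stays on a real data point.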
Contiguous Cluster (Nearest neighbor or
Transitive):
A cluster is a set of points such that a point in a cluster is
closer (or more similar) to one or more other points in
the cluster than to any point not in the cluster.
(Figure: 8 contiguous clusters.)
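One way to illustrate the transitive nature of contiguous clusters is to link every pair of points that lie within some distance of each other and take the connected groups. This is a sketch only: the threshold `eps` and the function name are assumptions, not part of the definition above.

```python
from math import dist

def contiguous_clusters(points, eps):
    """Label points so that two points share a label whenever a chain of
    neighbors, each step no longer than eps, connects them."""
    labels = [None] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:                      # depth-first walk along the chain
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] is None and dist(points[i], points[j]) <= eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical data: a chain of four points and a separate pair.
chain = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0)]
print(contiguous_clusters(chain, eps=1.5))  # [0, 0, 0, 0, 1, 1]
```

Note that (0, 0) and (3, 0) end up in the same cluster even though they are farther apart than `eps`, because intermediate points connect them.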
Density-based:
A cluster is a dense region of points that is separated
from other high-density regions by regions of low
density.
This definition is used when the clusters are irregular or
intertwined, and when noise and outliers are present.
(Figure: 6 density-based clusters.)
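The slides do not name an algorithm for density-based clusters. A well-known one is DBSCAN, and the following is a simplified sketch of it, assuming Euclidean distance; the parameters `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a dense point) and the sample data are illustrative choices.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Simplified DBSCAN: grow clusters from dense ("core") points and
    label points in low-density regions as NOISE."""
    n = len(points)
    # Neighborhood of each point (includes the point itself).
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE             # may later become a border point
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster       # border point: joins, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # core point: keep growing
        cluster += 1
    return labels

# Hypothetical data: two dense squares and one isolated outlier.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (8, 8), (8, 9), (9, 8), (9, 9),
        (4, 20)]
print(dbscan(data, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point is reported as noise rather than being forced into a cluster, which is why this family of methods suits data with outliers.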
Shared-property or conceptual clusters:
Clusters whose members share some common property or
represent a particular concept.
Types of clusterings:
The main types of clusterings are:
1: Hierarchical clustering.
2: Partitional clustering.
3: Overlapping clustering.
4: Complete clustering.
5: Partial clustering.
6: Fuzzy clustering.
Partitional clustering:
A division of the set of data objects into non-overlapping
subsets such that each data object is in exactly one
subset.
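The two conditions in this definition (no overlap, every object in exactly one subset) can be tested mechanically. A minimal sketch, with clusters represented as Python sets; the function name is illustrative.

```python
def is_partition(objects, clusters):
    """True if the clusters are non-overlapping and together contain
    every object exactly once."""
    seen = set()
    for cluster in clusters:
        if seen & cluster:        # some object already sits in another cluster
            return False
        seen |= cluster
    return seen == set(objects)   # nothing left out, nothing extra

objects = {1, 2, 3, 4}
print(is_partition(objects, [{1, 2}, {3, 4}]))     # True
print(is_partition(objects, [{1, 2}, {2, 3, 4}]))  # False: 2 appears twice
print(is_partition(objects, [{1, 2}, {3}]))        # False: 4 is unassigned
```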
Hierarchical clustering:
If we permit clusters to have subclusters, we obtain a
hierarchical clustering, which is a set of nested clusters
organized as a tree.
Each node (cluster) in the tree is the union of its children
(subclusters), and the root of the tree is the cluster
containing all the objects.
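The tree structure can be sketched with a small node type in which every internal node stores the union of its children. The class and helper names (`ClusterNode`, `leaf`, `merge`, `cut`) are illustrative assumptions, not part of any library.

```python
from dataclasses import dataclass, field

@dataclass
class ClusterNode:
    members: frozenset = frozenset()
    children: list = field(default_factory=list)

def leaf(*objects):
    """A cluster with no subclusters."""
    return ClusterNode(members=frozenset(objects))

def merge(*children):
    """An internal node: its members are the union of its children."""
    members = frozenset().union(*(c.members for c in children))
    return ClusterNode(members=members, children=list(children))

def cut(node, depth):
    """Flat clusters obtained by descending `depth` levels from `node`."""
    if depth == 0 or not node.children:
        return [set(node.members)]
    result = []
    for child in node.children:
        result.extend(cut(child, depth - 1))
    return result

# Hypothetical hierarchy over five objects.
root = merge(merge(leaf("a"), leaf("b")),
             merge(leaf("c"), merge(leaf("d"), leaf("e"))))

print(sorted(root.members))  # ['a', 'b', 'c', 'd', 'e']
print(len(cut(root, 1)))     # 2 clusters one level below the root
```

Cutting the tree at a chosen depth turns the nested clusters into a flat, partitional clustering.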
Overlapping clustering:
Overlapping clustering allows a data object to belong to
two or more clusters. A real-world example is the
breakdown of personnel at a school: overlapping
clustering would allow a person to be grouped as both a
student and an employee, while exclusive clustering would
require assigning the person to whichever group is more
important.
Fuzzy clustering:
In fuzzy clustering, every data object belongs to every
cluster with a membership weight between 0 and 1, where
0 means the object does not belong to the cluster at all
and 1 means it absolutely belongs to it. Fuzzy clustering
can be described as an extreme version of overlapping
clustering; the major difference is the membership
weight. Probabilistic clustering is closely related: it
assigns each object a probability of belonging to each
cluster.
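The slides do not say how membership weights are computed. One common choice is the membership formula from fuzzy c-means, sketched below under the assumptions that the cluster centers are already known, that distance is Euclidean, and that the weights of each object sum to 1; the fuzzifier `m` and the function name are illustrative.

```python
from math import dist

def memberships(point, centers, m=2.0):
    """Fuzzy c-means style weights: closer centers get larger weights,
    and the weights for one point sum to 1. Requires m > 1."""
    d = [dist(point, c) for c in centers]
    if 0.0 in d:
        # The point coincides with a center: full membership there.
        hits = d.count(0.0)
        return [1.0 / hits if x == 0.0 else 0.0 for x in d]
    power = 2.0 / (m - 1.0)
    inverse = [x ** -power for x in d]
    total = sum(inverse)
    return [v / total for v in inverse]

# Hypothetical centers on a line.
centers = [(0.0, 0.0), (4.0, 0.0)]
print(memberships((2.0, 0.0), centers))  # [0.5, 0.5], exactly in between
print(memberships((0.0, 0.0), centers))  # [1.0, 0.0], sits on a center
```

A point near one center receives a weight close to 1 for that cluster and a small but nonzero weight for the others.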
Complete clustering:
The distinction between complete and partial clustering
is whether all data objects must be grouped. A complete
clustering assigns every object to a cluster.
Partial clustering:
Partial clustering, on the other hand, allows some
data objects to be left unassigned.
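Whether a clustering is complete or partial can be read off by looking for unassigned objects. A small sketch with clusters as Python sets; the function name `coverage` is illustrative.

```python
def coverage(objects, clusters):
    """Return ("complete" or "partial", set of unassigned objects)."""
    assigned = set().union(*clusters)
    unassigned = set(objects) - assigned
    kind = "complete" if not unassigned else "partial"
    return kind, unassigned

clusters = [{"a"}, {"b", "c"}]
print(coverage({"a", "b", "c"}, clusters))       # ('complete', set())
print(coverage({"a", "b", "c", "x"}, clusters))  # ('partial', {'x'})
```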
Applications:
Cluster analysis plays a vital role in numerous fields,
ranging from biology to machine learning. Its
application depends on whether clustering is used
as a stepping stone and basis for further analysis
(utility) or as a tool for understanding.
Understanding: When the purpose of data analysis is
to understand the data set, cluster analysis is the
study of techniques for automatically finding
classes, because every cluster is a potential class
that only needs a class label. Applications of this
use of clustering exist in biology (taxonomy and
grouping genetic information), in information
retrieval, and in climate science (finding patterns
in the atmosphere and ocean). In psychology and
medicine, clustering is used for the diagnosis of
diseases, and in business it is used to segment
customers into small groups that can later be
targeted in marketing activities.
Utility: Cluster analysis can also be used as the
basis for other data analysis or processing
techniques. In this context, cluster analysis is the
study of techniques for finding the most
representative cluster prototypes.
Applications of this use of clustering include
summarization, which uses clustering to mitigate the
curse of dimensionality and the cost of large data sets
by applying the algorithm to cluster prototypes rather
than to every object. Clustering can also be used to
efficiently find nearest neighbors.
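Both utility uses can be sketched together: each cluster is summarized by a prototype, and a nearest-neighbor query first picks the closest prototype and then searches only that cluster. This is an illustrative sketch that assumes centroids as prototypes and Euclidean distance; the result is approximate, since the true nearest neighbor can lie in an adjacent cluster.

```python
from math import dist

def centroid(points):
    """Coordinate-wise average, used here as the cluster prototype."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def build_index(clusters):
    """Summarize each cluster by its prototype."""
    return [(centroid(members), members) for members in clusters]

def approx_nearest(query, index):
    """Search only the cluster whose prototype is closest to the query."""
    _, members = min(index, key=lambda entry: dist(query, entry[0]))
    return min(members, key=lambda p: dist(query, p))

# Hypothetical clusters, e.g. produced by an earlier clustering step.
clusters = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)],
]
index = build_index(clusters)
print(approx_nearest((9.0, 9.0), index))  # (10.0, 10.0)
```

With k clusters of roughly equal size, a query compares against k prototypes plus one cluster's members instead of every object.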
