Cluster Analysis

Cluster analysis is an unsupervised machine learning technique used to group similar data points together. It involves the following steps:
1. Select variables to use for clustering, such as demographics or purchase history.
2. Compute the distance between data points on the selected variables. Shorter distances indicate more similarity.
3. Apply a clustering algorithm like k-means, which assigns data points to k clusters based on distance to centroids, recomputing centroids until clusters stabilize.

Uploaded by

2ndYEAR ARQUERO

FUNDAMENTALS OF DESCRIPTIVE ANALYTICS (BUMA 30063)

11/27/2023
Cluster Analysis

• Cluster Analysis
- Also known as clustering, it is a technique used in data analysis and machine learning to group similar data points or objects based on certain characteristics or features they possess.
• Segmentation
- A way of organizing customers into groups with similar traits, product preferences, or expectations.
• Clustering
- A broader analytical technique used in various domains to identify patterns and groupings in data, without a specific marketing or business context.

Segmentation vs. Clustering

Usage
- Segmentation: Primarily used in marketing and business strategy.
- Clustering: Used in a broader range of fields, including data analysis, machine learning, image processing, etc.
Purpose
- Segmentation: Divides a market or customer base into distinct and relatively homogeneous segments based on various characteristics, behaviors, or preferences.
- Clustering: Groups data points into clusters based on similarities in their features or attributes.
Method
- Segmentation: Involves analyzing customer demographics, behaviors, purchase history, psychographics, etc., to identify market segments that have similar needs or behaviors.
- Clustering: Utilizes mathematical algorithms and distance metrics to identify clusters of data points that are closer to each other in feature space.

The auto insurance company Geico intends to tailor its insurance options. Your objective is to understand the key factors customers prioritize in their insurance provider. You've created a survey for current Geico customers, aiming to gauge their perception of the importance of two aspects when selecting auto insurance: cost savings on premiums and the availability of a local agent.

Likert Scale

A Likert-type scale, which takes its name from psychologist Rensis Likert, is a frequently employed psychometric measurement tool for evaluating individuals' attitudes, opinions, perceptions, or behaviors. The approach entails providing respondents with a statement or multiple statements and prompting them to express their degree of agreement or disagreement using a predetermined range of response choices.

Cluster Analysis Goal

The primary inputs for cluster analysis are measures of similarity between customers, such as (a) correlation coefficients and (b) distance measures.

• Correlation coefficients
- Measure the association between two variables. They range from −1, which indicates a negative association like that between sales and price (when prices go up, sales go down, and vice versa), to 1, which indicates a positive association like that between sales and advertising (when advertising goes up, so do sales; at least, that's the intention). Zero implies no linear association between the variables.
• Distance measures
- Measures of the difference between two customers on the variables used for segmentation.

Steps in Cluster Analysis
1. Formulate the problem.
2. Compute distance between customers along the selected variables.
3. Apply the clustering procedure to the distance measures.
4. Decide on the number of clusters.
5. Profile clusters.

1.) Step 1: Formulate the problem.
• The first step is to select the variables that you wish to use as the basis for clustering. Those variables determine how well your segmentation works in terms of marketing.
• Identifiability
- Refers to the extent to which managers can recognize segments in the marketplace. Ex: Demographics
• Substantiality
- Criterion that is satisfied if the segments represent a large enough portion of the market to ensure that marketing programs can be customized profitably.
• Accessibility
- The extent to which managers can reach the identified segments through marketing campaigns is captured by the accessibility criterion.
• Actionability
- Refers to whether customers in a certain segment and the marketing mix necessary to satisfy their needs are consistent with the goals and core competencies of the firm. Ex: Channels of communication like FB Chatbot, Emails, Physical Meetings

2.) Step 2: Compute Distance
• The main input into any cluster-analysis procedure is a measure of distance between individuals who are being clustered. The objective of a distance measure is to quantify the difference between two individuals on the variables you are using for the segmentation.
• A shorter distance between two individuals implies that they have similar preferences on the segmentation variables, and may be in the same cluster.
• A longer distance implies that they have dissimilar preferences and may be in different clusters.

What is the Euclidean distance between Joe and Sam?
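For two customers rated on the same variables, the Euclidean distance is the square root of the sum of squared differences between their ratings. The Python sketch below illustrates the calculation. The ratings for Joe and Sam are hypothetical placeholders, not the values from the exercise, and the function name is an assumption made for illustration.

```python
import math


def euclidean_distance(a, b):
    """Return the straight-line distance between two equal-length rating vectors."""
    if len(a) != len(b):
        raise ValueError("both customers must be rated on the same variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical ratings (NOT from the exercise):
# (importance of premium savings, importance of a local agent)
joe = (6, 3)
sam = (2, 6)

# sqrt((6 - 2)^2 + (3 - 6)^2) = sqrt(16 + 9) = 5.0
print(euclidean_distance(joe, sam))  # prints 5.0
```

A smaller result would mean the two customers answered more alike and are more likely to end up in the same cluster.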
3.) Step 3: Apply the Clustering Procedure

• K-means clustering is one of the more popular algorithms used for clustering, and it is gaining even more popularity with the growth of machine learning. It belongs to the nonhierarchical class of clustering algorithms, meaning the clustering algorithm does not impose a hierarchical structure on the variables used for the segmentation.
• For K-means clustering, the manager has to specify the number of clusters, k, required before starting the clustering algorithm. The basic algorithm for K-means clustering is as follows:
1. Choose the number of clusters, k.
2. Generate k random points as cluster centroids.
3. Assign each point to the nearest cluster centroid.
4. Recompute the new cluster centroid.
5. Repeat the two previous steps until some convergence criterion is met. Usually the convergence criterion is met when the assignment of customers to clusters has not changed over multiple iterations.

Repeat until the cluster assignments tally, that is, until they no longer change from one iteration to the next.
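The five K-means steps can be sketched in plain Python as follows. This is a minimal illustration rather than a production implementation, and several details are assumptions: the function name `kmeans`, the `seed` and `max_iter` parameters, the choice to draw the starting centroids from the data points themselves, and stopping as soon as one pass leaves every assignment unchanged. The sample points are hypothetical.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster a list of numeric tuples into k groups; return (assignments, centroids)."""
    rng = random.Random(seed)
    # Step 2: pick k starting centroids (assumption: sampled from the data points).
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iter):
        # Step 3: assign each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step 5: stop once the assignments no longer change.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 4: recompute each centroid as the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Hypothetical survey ratings forming two visibly separate groups.
customers = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centers = kmeans(customers, k=2)
print(labels)
print(centers)
```

Cluster numbers are arbitrary labels, so a different seed may swap which group is called 0 and which is called 1.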

4.) Step 4: Decide on Number of Clusters


• A commonly used method to determine the optimal k is the “elbow criterion.” The elbow criterion means finding a number of clusters such that adding another cluster does not add sufficient information. In other words, this is the point at which another cluster would add complexity to the marketing process that is not justified by the returns from customizing marketing to the additional segment.
• In simpler terms, the "elbow" is where the plot of within-cluster variation against the number of clusters starts to bend, indicating the point where increasing the number of clusters does not significantly improve the model's performance or fit to the data.
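The elbow is usually read off a plot by eye, but the idea can be approximated in code. The sketch below assumes you already have the within-cluster sum of squares (WCSS) for each candidate k. It returns the first k after which one more cluster reduces WCSS by less than a chosen share of the starting value. The function name, the 10% threshold, and the WCSS figures are all hypothetical and only illustrate the reasoning.

```python
def elbow_k(wcss_by_k, min_relative_gain=0.10):
    """Return the smallest k after which an extra cluster adds too little.

    wcss_by_k maps each candidate k to its within-cluster sum of squares.
    The gain from k to the next k is measured relative to the WCSS of the
    smallest k (a simple heuristic, not a standard definition).
    """
    ks = sorted(wcss_by_k)
    baseline = wcss_by_k[ks[0]]
    if baseline == 0:
        return ks[0]  # the data are already perfectly fit
    for current, nxt in zip(ks, ks[1:]):
        gain = (wcss_by_k[current] - wcss_by_k[nxt]) / baseline
        if gain < min_relative_gain:
            return current
    return ks[-1]  # no bend found within the candidates tried


# Hypothetical WCSS values: large drops up to k = 3, then the curve flattens.
wcss = {1: 1000.0, 2: 400.0, 3: 150.0, 4: 130.0, 5: 120.0}
print(elbow_k(wcss))  # prints 3
```

A stricter or looser threshold moves the chosen k, which mirrors the judgment call a manager makes when weighing extra segments against extra marketing complexity.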
5.) Step 5: Profile Clusters
• Profiling clusters means describing them in terms of the variables used for clustering, or in terms of additional data, such as demographics. This way, a company can extrapolate from data on its own customers to find the most likely potential customers.
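One simple way to profile clusters is to report each cluster's size and its average on every variable. The Python sketch below does that. The function name, variable names, ratings, and cluster labels are hypothetical and stand in for real survey data and real clustering output.

```python
from collections import defaultdict


def profile_clusters(rows, labels, variable_names):
    """Return {cluster label: {"size": n, variable name: mean, ...}}."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    profiles = {}
    for label, members in grouped.items():
        # zip(*members) walks the data column by column, one variable at a time.
        means = [sum(column) / len(members) for column in zip(*members)]
        profiles[label] = {"size": len(members), **dict(zip(variable_names, means))}
    return profiles


# Hypothetical ratings and cluster labels for three customers.
ratings = [(6, 2), (7, 1), (2, 6)]
cluster_labels = [0, 0, 1]
names = ["premium_savings", "local_agent"]

for label, profile in sorted(profile_clusters(ratings, cluster_labels, names).items()):
    print(label, profile)
```

The same function works for additional data such as demographics, provided each row carries those values in the same order as `variable_names`.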
