KEMBAR78
Unit IV Unsupervised Learning 73 81 | PDF | Cluster Analysis | Theoretical Computer Science
0% found this document useful (0 votes)
7 views9 pages

Unit IV Unsupervised Learning 73 81

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm that identifies dense regions in data without requiring the number of clusters to be specified in advance. It classifies points into core points, border points, and noise points based on their density and proximity to other points. The algorithm involves selecting parameters, examining neighborhoods, expanding clusters, and finalizing the classification of points after processing the entire dataset.

Uploaded by

ap6859343
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views9 pages

Unit IV Unsupervised Learning 73 81

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm that identifies dense regions in data without requiring the number of clusters to be specified in advance. It classifies points into core points, border points, and noise points based on their density and proximity to other points. The algorithm involves selecting parameters, examining neighborhoods, expanding clusters, and finalizing the classification of points after processing the entire dataset.

Uploaded by

ap6859343
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 9

DBSCAN ALGORITHM

• INTRODUCTION

• DBSCAN, which stands for Density-Based Spatial Clustering of


Applications with Noise, is a powerful clustering algorithm
that groups points that are closely packed together in data
space.

• Unlike some other clustering algorithms, DBSCAN doesn't


require you to specify the number of clusters beforehand.

• The algorithm works by defining clusters as dense regions


separated by regions of lower density.
73
DBSCAN ALGORITHM
• KEY CONCEPTS

• Core Points: These are points that have at least a minimum


number of other points (MinPts) within a specified distance
(ε or epsilon).

• Border Points: These are points that are within the ε distance
of a core point but don't have MinPts neighbors themselves.

• Noise Points: These are points that are neither core points
nor border points. They're not close enough to any cluster to
be included.
74
DBSCAN ALGORITHM

75
DBSCAN ALGORITHM
• PARAMETERS IN DBSCAN

• ε (epsilon): The maximum distance between two points for


them to be considered as neighbors.

• MinPts: The minimum number of points required to form a


dense region.

76
DBSCAN ALGORITHM

77
DBSCAN ALGORITHM
• WORKING OF DBSCAN

• STEP-1: Parameter Selection


– Choose ε (epsilon): The maximum distance between two points for
them to be considered as neighbors.

– Choose MinPts: The minimum number of points required to form a


dense region.

• STEP-2: Select a Starting Point


– The algorithm starts with an arbitrary unvisited point in the dataset.

78
DBSCAN ALGORITHM
• STEP-3: Examine the Neighborhood
– It retrieves all points within the ε distance of the starting point.

– If the number of neighboring points is less than MinPts, the point is


labeled as noise (for now).

– If there are at least MinPts points within ε distance, the point is


marked as a core point, and a new cluster is formed.

79
DBSCAN ALGORITHM
• STEP-4: Expand the Cluster
– All the neighbors of the core point are added to the cluster.

– For each of these neighbors:

• If it's a core point, its neighbors are added to the cluster


recursively.

• If it's not a core point, it's marked as a border point, and the
expansion stops.

80
DBSCAN ALGORITHM
• STEP-5: Repeat the Process
– The algorithm moves to the next unvisited point in the dataset.

– Steps 3-4 are repeated until all points have been visited.

• STEP-6: Finalize Clusters


– After all points have been processed, the algorithm identifies all
clusters.

– Points initially labeled as noise might now be border points if they're


within ε distance of a core point.

• STEP-7: Handling Noise


– Any points not belonging to any cluster remain classified as noise. 81

You might also like