Density-Based Clustering
Density-based methods model clusters as dense regions in the data space, separated by sparse regions. They do not attempt to assign every object to a cluster; many objects may be left out as "noise".
- Major features:
  - Discovers clusters of arbitrary shape, whereas partitioning and hierarchical methods are designed to find spherical (convex) clusters
  - Handles noise
  - Needs only one scan through the data
  - Needs parameters that define the density threshold, but not the number of clusters
DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
- Density of an object $o$: the number of objects close to $o$
- Core objects: objects that have a dense neighbourhood
- DBSCAN connects core objects and their neighbourhoods to form dense regions as clusters
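- Two parameters:
  - $\varepsilon$: the maximum radius of a neighbourhood, where the $\varepsilon$-neighbourhood of an object $q$ in data set $D$ is $N_\varepsilon(q) = \{\, p \in D \mid \mathrm{dist}(p, q) \le \varepsilon \,\}$
  - MinPts: the minimum number of objects an $\varepsilon$-neighbourhood must contain to count as dense, so $q$ is a core object when $|N_\varepsilon(q)| \ge \text{MinPts}$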
- Directly density-reachable: A point $p$ is directly density-reachable from a core point $q$ if $p$ is within the $\varepsilon$-neighbourhood of $q$.
  - By definition, no points are directly density-reachable from a non-core point.
- Density-reachable: $p$ is density-reachable from a core point $q$ if there is a chain of objects $p_1, \dots, p_n$ such that $p_1 = q$, $p_n = p$, and $p_{i+1}$ is directly density-reachable from $p_i$ with respect to $\varepsilon$ and MinPts.
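- Density-connected: Two objects $p$ and $q$ are density-connected if there is an object $o$ such that both $p$ and $q$ are density-reachable from $o$ with respect to $\varepsilon$ and MinPts.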
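The definitions above can be checked mechanically on a small data set. The following is a minimal sketch, not a reference implementation: the function names, the brute-force neighbourhood search, and the sample points are all assumptions made for illustration, and a point is counted as part of its own neighbourhood.

```python
from math import dist

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, p in enumerate(points) if dist(points[i], p) <= eps]

def is_core(points, i, eps, min_pts):
    """A core object has at least min_pts objects in its eps-neighbourhood."""
    return len(region_query(points, i, eps)) >= min_pts

def directly_reachable(points, p, q, eps, min_pts):
    """True if p is directly density-reachable from q (q must be core)."""
    return is_core(points, q, eps, min_pts) and p in region_query(points, q, eps)

def density_reachable(points, p, q, eps, min_pts):
    """True if a chain of directly density-reachable steps leads from q to p."""
    if not is_core(points, q, eps, min_pts):
        return False
    seen, frontier = {q}, [q]
    while frontier:
        cur = frontier.pop()
        if not is_core(points, cur, eps, min_pts):
            continue  # a non-core object can end a chain but never extend it
        for nxt in region_query(points, cur, eps):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return p in seen

def density_connected(points, p, q, eps, min_pts):
    """True if some object o reaches both p and q."""
    return any(
        density_reachable(points, p, o, eps, min_pts)
        and density_reachable(points, q, o, eps, min_pts)
        for o in range(len(points))
    )

# Hypothetical sample: four points on a line plus one far-away point.
pts = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
print(directly_reachable(pts, 0, 1, 1.0, 3))  # point 1 is core, point 0 is its neighbour
print(density_reachable(pts, 1, 3, 1.0, 3))   # point 3 is not core, so nothing is reachable from it
print(density_connected(pts, 0, 3, 1.0, 3))   # both end points are reachable from point 1
```

With these sample points, density-reachability is not symmetric (point 3 is reachable from point 1, but not the other way round), whereas density-connectivity is.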
Definition of Cluster in DBSCAN
- All points within the cluster $C$ are mutually density-connected, and
- There is no point outside $C$ that is density-connected to a point inside $C$.
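These two conditions are commonly written as connectivity and maximality. The block below is a sketch in that form, assuming $D$ denotes the data set and $C \subseteq D$ is a non-empty cluster, with both conditions taken with respect to $\varepsilon$ and MinPts. The maximality line uses density-reachability, which is the usual phrasing, rather than the wording of these notes.

```latex
\begin{align*}
\text{Connectivity:} \quad & \forall p, q \in C:\ p \text{ is density-connected to } q \\
\text{Maximality:} \quad & \forall p, q \in D:\ \bigl(p \in C \ \wedge\ q \text{ is density-reachable from } p\bigr) \ \Rightarrow\ q \in C
\end{align*}
```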
Example of density-reachable and density-connected:
> Let $\varepsilon$ be the radius of the circles and MinPts = 3.
> $m$, $p$, $o$, and $r$ are core objects.
> Object $q$ is directly density-reachable from $m$.
> Object $m$ is directly density-reachable from $p$ and vice versa.
> Object $q$ is density-reachable from $p$ because $q$ is directly density-reachable from $m$ and $m$ is directly density-reachable from $p$. However, $p$ is not density-reachable from $q$ because $q$ is not a core object.
> $r$ and $s$ are density-reachable from $o$.
> $o$ is density-reachable from $r$.
> $o$, $r$, and $s$ are all density-connected.
DBSCAN algorithm
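The procedure can be sketched in a few lines of Python. This is an illustrative version under stated assumptions, not necessarily the pseudocode these notes originally showed: the names `dbscan` and `NOISE` are hypothetical, neighbourhoods are found by brute force (quadratic time, no spatial index), and a point counts as a member of its own neighbourhood.

```python
from math import dist

NOISE = -1  # label for objects that belong to no cluster

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for noise."""
    labels = [None] * len(points)  # None means "not yet visited"

    def neighbours(i):
        # Brute-force eps-neighbourhood of points[i], including i itself.
        return [j for j, p in enumerate(points) if dist(points[i], p) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # provisional: may become a border point later
            continue
        cluster += 1  # i is a core object, so it starts a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is core, so the cluster grows through it
                queue.extend(nbrs)
    return labels

# Hypothetical sample: two dense runs on a line and one isolated point.
pts = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (20, 0), (21, 0), (22, 0)]
print(dbscan(pts, eps=1.0, min_pts=3))  # [0, 0, 0, 0, -1, 1, 1, 1]
```

Here the first four points form one cluster, the last three form another, and the isolated point at `(10, 0)` is reported as noise. Points first marked as noise can still be relabelled as border points once a core object reaches them, which is why the noise label is only provisional during the scan.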
DBSCAN worked example