Indexed by:Conference paper
Date of Publication:2018-03-15
Included Journals:EI
Volume:10758 LNAI
Page Number:87-100
Abstract:Density Peaks based Clustering (DPC) is a recently proposed clustering algorithm that first selects some representative objects, named density peaks, and then assigns each remaining object to one of the density peaks. Unlike classical centroid-based clustering algorithms, DPC can find arbitrary-shaped clusters and requires no predefined initial centroid set. However, a key disadvantage of DPC lies in its computational complexity. DPC requires the computation of two indicators for each data object. As the number of data objects increases, the computational cost of DPC grows dramatically, which limits its application to many real-world problems. For example, when taxi drop-offs are used to analyze human mobility, DPC cannot be applied directly because of the large number of taxi drop-off records. This paper proposes an efficient DPC algorithm based on grid density, called Grid-DPC. The effective data space is partitioned into a desirable number of grids, and the two indicators are computed for each grid. Because the number of grids is much smaller than the number of data objects, a great amount of computation time and memory space can be saved. In experiments, we compare Grid-DPC with K-centers, affinity propagation and DPC on both synthetic and publicly available datasets. Results demonstrate that Grid-DPC achieves clustering performance comparable to that of the classical DPC. We also employ Grid-DPC to analyze large-scale taxi records of a city in China and of the Manhattan area of New York. The discovered human mobility zones have great potential in urban planning and can help taxi drivers make better routing decisions. © 2018, Springer International Publishing AG, part of Springer Nature.
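Illustrative sketch:The grid idea described in the abstract might look roughly like the following. This is not the authors' implementation. It assumes the two indicators are the usual DPC quantities (local density and distance to the nearest denser object), here computed per non-empty grid cell. The function name `grid_dpc`, the parameters `cells_per_axis` and `n_peaks`, the uniform grid, and the rule of selecting peaks by the largest density-times-distance product are all assumptions made for illustration.

```python
import math
from collections import Counter


def grid_dpc(points, cells_per_axis=20, n_peaks=2):
    """Toy grid-based density-peaks clustering; returns one label per point.

    Hypothetical sketch: the grid layout and the peak-selection rule are
    illustrative assumptions, not taken from the paper.
    """
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]
    # Cell width per axis; fall back to 1.0 when an axis has zero extent.
    width = [((hi[d] - lo[d]) or 1.0) / cells_per_axis for d in range(dims)]

    def cell_of(p):
        # Integer cell index along each axis, clamped to the last cell.
        return tuple(
            min(int((p[d] - lo[d]) / width[d]), cells_per_axis - 1)
            for d in range(dims)
        )

    point_cells = [cell_of(p) for p in points]

    # Indicator 1: density of a cell = number of points that fall into it.
    rho = Counter(point_cells)

    # Non-empty cells from densest to sparsest (ties broken by cell index).
    order = sorted(rho, key=lambda c: (-rho[c], c))

    # Indicator 2: distance (in cell units) to the nearest denser cell.
    delta, parent = {}, {}
    for rank, c in enumerate(order):
        if rank == 0:
            # Simplification: the densest cell gets infinite distance,
            # so it is always selected as a peak.
            delta[c] = math.inf
            continue
        nearest = min(order[:rank], key=lambda o: math.dist(c, o))
        delta[c] = math.dist(c, nearest)
        parent[c] = nearest

    # Assumed selection rule: peaks are the cells with largest rho * delta.
    peaks = sorted(order, key=lambda c: -rho[c] * delta[c])[:n_peaks]
    label = {c: k for k, c in enumerate(peaks)}

    # Each remaining cell inherits the label of its nearest denser cell;
    # walking in density order guarantees the parent is already labelled.
    for c in order:
        if c not in label:
            label[c] = label[parent[c]]

    # Every point takes the label of the cell that contains it.
    return [label[c] for c in point_cells]
```

The pairwise distance step runs over non-empty cells rather than over data objects, which is where the saving described in the abstract would come from in this sketch.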