
Fast affinity propagation clustering based on incomplete similarity matrix


Indexed by:Journal Article

Date of Publication:2017-06-01

Journal:KNOWLEDGE AND INFORMATION SYSTEMS

Included Journals:SCIE

Volume:51

Issue:3

Page Number:941-963

ISSN No.:0219-1377

Key Words:Exemplar-based clustering; Affinity propagation; Incomplete similarity matrix; Fast algorithm

Abstract:Affinity propagation (AP) is a recently proposed clustering algorithm that has been successfully used in many practical problems. Although AP is effective in finding meaningful clustering solutions, its key disadvantage is low efficiency, which has become the bottleneck when applying AP to large-scale problems. In the literature, most methods proposed to improve the efficiency of AP implement the message passing on a sparse similarity matrix, but neither the decline in effectiveness nor the improvement in efficiency is analyzed theoretically. In this paper, we propose a two-stage fast affinity propagation (FastAP) algorithm. Unlike previous work, the similarity matrix is first compressed by selecting only potential exemplars, and then further reduced by sparsification according to the k nearest neighbors. More importantly, we provide a theoretical analysis, based on which the efficiency improvement of our method is controllable with guaranteed clustering performance. In experiments, two synthetic data sets, seven publicly available data sets, and two real-world streaming data sets are used to evaluate the proposed method. The results demonstrate that FastAP achieves clustering performance comparable to that of the original AP algorithm, while computational efficiency improves by a several-fold speed-up on small data sets and a speed-up of dozens of times on larger-scale data sets.
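
The k-nearest-neighbor sparsification mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's FastAP implementation: the function name `knn_sparsify`, the dict-of-dicts storage, and the negative squared Euclidean similarity are assumptions made for illustration, and the first stage (selecting potential exemplars) is omitted because the abstract does not describe its selection criterion.

```python
import heapq


def knn_sparsify(similarity, k):
    """Keep, for each point i, the diagonal entry s(i, i) and the k largest
    off-diagonal similarities s(i, j); all other entries are dropped.

    Hypothetical helper for illustration, not the paper's algorithm.
    """
    n = len(similarity)
    sparse = {}
    for i in range(n):
        # Candidate neighbors are all points except i itself.
        candidates = [(similarity[i][j], j) for j in range(n) if j != i]
        nearest = heapq.nlargest(k, candidates)
        row = {j: s for s, j in nearest}
        # In AP the diagonal holds the preference value, so it is always kept.
        row[i] = similarity[i][i]
        sparse[i] = row
    return sparse


def negative_squared_distance(p, q):
    """Similarity commonly used with AP: larger (closer to 0) means more similar."""
    return -sum((a - b) ** 2 for a, b in zip(p, q))


# Toy data: two well-separated groups of points (assumed example data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
similarity = [[negative_squared_distance(p, q) for q in points] for p in points]

sparse = knn_sparsify(similarity, k=2)
stored = sum(len(row) for row in sparse.values())
print(stored, "of", len(points) ** 2, "entries kept")
```

With n points and k neighbors per point, the sparse structure stores n(k + 1) entries instead of n², which is the source of the savings when message passing is restricted to stored entries.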
