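Personal Information
Associate Professor
Doctoral Supervisor
Master's Supervisor
Main Positions: None
Gender: Male
Alma Mater: Dalian University of Technology
Degree: Doctorate
Affiliation: School of Software, International School of Information and Software
Discipline: Software Engineering
Office: Room 417, Comprehensive Building, School of Software
Contact: liangzhao@dlut.edu.cn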
ICFS Clustering With Multiple Representatives for Large Data
Paper Type: Journal Article
Publication Date: 2019-03-01
Journal: IEEE Transactions on Neural Networks and Learning Systems
Indexed In: SCIE, EI
Volume: 30
Issue: 3
Pages: 728-738
ISSN: 2162-237X
Keywords: Clustering by fast search (CFS); clusters adjustment; incremental clustering; large data; multiple representatives; objects assignment
Abstract: With the prevailing development of cyber-physical-social systems and the Internet of Things, large-scale data have been collected consistently. Mining large data effectively and efficiently becomes increasingly important to promote the development and improve the service quality of these applications. Clustering, a popular data mining technique, aims to identify underlying patterns hidden in the data. Most clustering methods assume static data and are therefore ill-suited to analyzing large, unbalanced dynamic data. In this paper, to address this concern, we focus on incremental clustering by extending the novel clustering by fast search and find of density peaks (CFS) method to incrementally handle large-scale dynamic data. Specifically, we first discuss two challenges in incremental CFS (ICFS) clustering, i.e., assignment of newly arriving objects and dynamic adjustment of clusters. We then propose two ICFS clustering algorithms, ICFS with multiple representatives (ICFSMR) and the enhanced ICFSMR (E_ICFSMR), to tackle the two challenges. In ICFSMR, we explore the convex hull theory to modify the representatives identified for each cluster. E_ICFSMR improves the generality and effectiveness of ICFSMR by exploring a one-time cluster adjustment strategy after integration of each data chunk. We evaluate the proposed methods with extensive experiments on four benchmark data sets, as well as air quality and traffic monitoring time series, with comparisons to CFS and three other state-of-the-art incremental clustering methods. Experimental results demonstrate that the proposed methods outperform the compared methods in terms of both effectiveness and efficiency.
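For readers unfamiliar with the static CFS (density peaks) method that the abstract builds on, the following is a minimal sketch of that baseline only. It is not the ICFSMR or E_ICFSMR algorithm from the paper, whose details are not given here. The function name `density_peaks`, the cutoff distance `d_c`, the count-based density, and the choice of centers by the largest product of density and separation are all assumptions made for illustration.

```python
import math


def density_peaks(points, d_c, n_clusters):
    """Static CFS sketch (illustrative only): return one cluster label per point."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from highest to lowest density (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta_i: distance to the nearest point visited earlier (higher density).
    # parent_i: that nearest point, used later to propagate labels.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Assumed center rule: the n_clusters points with the largest rho * delta.
    # The densest point always qualifies, so every parent chain ends at a center.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_clusters]
    labels = [-1] * n
    for label, i in enumerate(centers):
        labels[i] = label

    # Each remaining point inherits the label of its denser nearest neighbour.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


# Hypothetical toy data: two well-separated groups of five points each.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
group_b = [(10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
labels = density_peaks(group_a + group_b, d_c=1.0, n_clusters=2)
```

This sketch recomputes all pairwise distances from scratch, which is the cost an incremental method would aim to avoid as new data chunks arrive.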