刘馨月

Personal Information

Associate Professor

Doctoral Supervisor

Master's Supervisor

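Gender: Female

Alma mater: Dalian University of Technology

Degree: Ph.D.

Affiliation: School of Software; International School of Information and Software

Discipline: Computer Software and Theory; Software Engineering

Email: xyliu@dlut.edu.cn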

Publications

Integrated constraint based clustering algorithm for high dimensional data

Paper type: Journal article

Publication date: 2014-10-22

Journal: NEUROCOMPUTING

Indexed in: SCIE, EI, Scopus

Volume: 142

Issue: SI

Pages: 478-485

ISSN: 0925-2312

Keywords: High dimensional data; Subspace clustering; Constraint based clustering

Abstract: Dimension selection, dimension weighting and data assignment are three circular dependent essential tasks for high dimensional data clustering and each such task is challenging. To meet the challenge of high dimensional data clustering, constraints have been employed in several previous works. However, these constraint based algorithms use constraints to help accomplish only one of the three essential tasks. In this paper, we propose an integrated constraint based clustering (ICBC) algorithm for high dimensional data, which exploits constraints to accomplish all the three essential tasks. Firstly we generalize the dimension selection technique of CDCDD algorithm such that dimension selection and dimension weighting could be accomplished simultaneously. Then we propose a novel constraint based data assignment method which assigns all the data points to their corresponding clusters based on the selected dimensions and dimension weights. Finally we use an optimization technique to iteratively refine the initial dimension weights and centroids, and reassign data accordingly till convergence. Experimental results on both synthetic data sets and real data sets show that our proposed ICBC algorithm outperforms typical unsupervised algorithms and other constraint based algorithms in terms of accuracy. ICBC also outperforms the other algorithms that implement dimension selection in terms of efficiency and scalability. (C) 2014 Elsevier B.V. All rights reserved.