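Professor, Doctoral Supervisor, Master's Supervisor
Gender: Male
Alma mater: University of Science and Technology of China
Degree: Doctorate
Affiliation: School of Software; International School of Information and Software
Discipline: Computer Application Technology; Software Engineering
Email: xczhang@dlut.edu.cn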
Paper type: Conference paper
Publication date: 2010-12-14
Indexed in: EI, Scopus
Page range: 629-638
Abstract: Clusters are hidden in subspaces of high-dimensional data, i.e., only a subset of features is relevant for each cluster. Subspace clustering is challenging because the search for the relevant features of each cluster and the detection of the final clusters are circularly dependent and should be solved simultaneously. In this paper, we point out that feature correlation and distance divergence are important to subspace clustering, but neither has been considered in previous works. Feature correlation groups correlated features independently and thus helps to reduce the search space of the relevant-feature search problem. Distance divergence distinguishes distances on different dimensions and helps to find the final clusters accurately. We tackle the two problems with the aid of a small amount of domain knowledge in the form of must-links and cannot-links. We then devise a semi-supervised subspace clustering algorithm, CDCDD. CDCDD integrates our solutions to the feature correlation and distance divergence problems, and uses an adaptive dimension voting scheme derived from a previous unsupervised subspace clustering algorithm, FINDIT. Experimental results on both synthetic and real data sets show that the proposed CDCDD algorithm outperforms FINDIT in terms of accuracy, and outperforms the other constraint-based algorithm, SCMINER, in terms of both accuracy and efficiency. © 2010 IEEE.
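For readers unfamiliar with the must-link and cannot-link constraints mentioned in the abstract, the following is a minimal sketch of how such pairwise constraints are commonly represented and checked, together with a per-dimension distance between two points. It is not the CDCDD algorithm; the function names `count_violations` and `per_dimension_distance`, and the sample data, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]  # indices of two data points


def count_violations(labels: Sequence[int],
                     must_links: List[Pair],
                     cannot_links: List[Pair]) -> int:
    """Count pairwise constraints that a cluster labeling fails to satisfy."""
    violated = 0
    # A must-link is violated when the two points land in different clusters.
    for i, j in must_links:
        if labels[i] != labels[j]:
            violated += 1
    # A cannot-link is violated when the two points land in the same cluster.
    for i, j in cannot_links:
        if labels[i] == labels[j]:
            violated += 1
    return violated


def per_dimension_distance(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Absolute difference on each dimension, kept separate instead of summed."""
    return [abs(a - b) for a, b in zip(x, y)]


# Hypothetical example: three points, one must-link and one cannot-link.
points = [[1.0, 5.0], [1.5, 9.0], [8.0, 5.5]]
must_links = [(0, 1)]
cannot_links = [(0, 2)]

print(count_violations([0, 0, 1], must_links, cannot_links))
print(per_dimension_distance(points[0], points[1]))
```

In this toy data, the must-linked pair is close on the first dimension and far apart on the second, which hints at why looking at distances dimension by dimension can help identify the features relevant to a cluster.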