Hits:
Indexed by:会议论文
Date of Publication:2016-01-01
Included Journals:CPCI-S
Volume:9652
Page Number:271-282
Abstract:Both uncertain data and high-dimensional data pose huge challenges to traditional clustering algorithms. It is even more challenging for clustering high dimensional uncertain data and there are few such algorithms. In this paper, based on the classical FINDIT subspace clustering algorithm for high dimensional data, we propose a constraint based semi-supervised subspace clustering algorithm for high dimensional uncertain data, UFINDIT. We extend both the distance functions and dimension voting rules of FINDIT to deal with high dimensional uncertain data. Since the soundness criteria of FINDIT fails for uncertain data, we introduce constraints to solve the problem. We also use the constraints to improve FINDIT in eliminating parameters' effect on the process of merging medoids. Furthermore, we propose some methods such as sampling to get an more efficient algorithm. Experimental results on synthetic and real data sets show that our proposed UFINDIT algorithm outperforms the existing subspace clustering algorithm for uncertain data.