Indexed by:Journal Article
Date of Publication:2015-03-15
Journal:Journal of Computational Information Systems
Included Journals:EI, Scopus
Volume:11
Issue:6
Page Number:2139-2146
ISSN No.:15539105
Abstract:Handling missing data is a challenging problem that arises frequently in data mining and pattern classification. This paper presents a fuzzy c-means clustering algorithm for incomplete data based on pseudo-nearest-neighbor intervals. The data are first completed using the pseudo-nearest-neighbor intervals approach, and the resulting data set is then clustered with the fuzzy c-means algorithm for interval-valued data. The proposed algorithm estimates the missing attribute values without normalization and thus captures the essence of pattern similarities in the original, untouched data set. Additionally, the pseudo-nearest-neighbor interval representation takes into account the implicit uncertainty of missing attribute values and also considers the angle between incomplete data and other data. Results on several incomplete data sets demonstrate the effectiveness of the proposed algorithm. Copyright © 2015 Binary Information Press.
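
The completion step described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it uses a plain nearest-neighbor interval built from partial distances and ignores the angle term that the pseudo-nearest-neighbor variant adds. The function names, the parameter `q`, and the use of `None` for missing values are assumptions made for illustration, and the interval-valued fuzzy c-means step is not shown.

```python
import math


def partial_distance(a, b):
    """Euclidean distance over attributes present in both records,
    rescaled to the full dimension (assumed distance, not from the paper)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def neighbor_intervals(data, q=2):
    """Turn every value into a (low, high) interval.

    Observed values become degenerate intervals (v, v). A missing value
    (None) becomes the min/max of that attribute over the q nearest
    records that have the attribute observed.
    """
    completed = []
    for i, row in enumerate(data):
        new_row = []
        for j, value in enumerate(row):
            if value is not None:
                new_row.append((value, value))
                continue
            # Rank the other records that observe attribute j by distance.
            candidates = sorted(
                (partial_distance(row, other), other[j])
                for k, other in enumerate(data)
                if k != i and other[j] is not None
            )
            nearest = [v for _, v in candidates[:q]]
            if not nearest:
                raise ValueError(f"attribute {j} is missing in every other record")
            new_row.append((min(nearest), max(nearest)))
        completed.append(new_row)
    return completed
```

Because the distances are computed on the raw attribute values, the sketch mirrors the abstract's point that estimation is done without normalization; an interval, rather than a single imputed number, keeps the uncertainty of the missing value visible to the later clustering step.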