A hybrid genetic algorithm-fuzzy c-means approach for incomplete data clustering based on nearest-neighbor intervals

Release Time: 2019-03-09

Indexed by: Journal Article

Date of Publication: 2013-10-01

Journal: SOFT COMPUTING

Included Journals: Scopus, EI, SCIE

Volume: 17

Issue: 10

Page Number: 1787-1796

ISSN: 1432-7643

Key Words: Fuzzy clustering; Hybrid approach; Incomplete data; Nearest-neighbor interval

Abstract: Incomplete data are often encountered in clustering problems, and inappropriate treatment of them can significantly degrade clustering performance. To account for the uncertainty of missing attributes, we propose an interval representation of missing attributes based on nearest-neighbor information, named the nearest-neighbor interval, and present a hybrid approach that combines a genetic algorithm with fuzzy c-means for incomplete data clustering. The overall algorithm operates within the genetic algorithm framework: it searches the corresponding nearest-neighbor intervals for appropriate imputations of the missing attributes to recover the incomplete data set, while fuzzy c-means performs the clustering analysis and simultaneously provides the fitness metric for the genetic optimization. Experimental results on several real-life data sets demonstrate that the hybrid approach achieves better clustering performance than the compared methods.
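
The workflow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the distance measure, how the interval bounds are derived, or the genetic operators, so everything below is an assumption made for illustration: a partial-distance measure over shared attributes, intervals taken as the minimum and maximum of the attribute among the q nearest neighbors, truncation selection, uniform crossover, and a mutation that resamples one gene inside its interval. The function names (`partial_distance`, `nearest_neighbor_intervals`, `fcm_objective`, `ga_impute`) are hypothetical and do not come from the paper. The sketch also assumes every missing attribute is observed in at least one other record.

```python
import math
import random


def partial_distance(a, b):
    """Distance over attributes observed in both records (None = missing), rescaled."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) * len(a) / len(shared))


def nearest_neighbor_intervals(data, q):
    """Map (row, col) of each missing attribute to (low, high) from its q nearest neighbors."""
    intervals = {}
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Assumption: only records with attribute j observed can be neighbors.
            candidates = [(partial_distance(row, other), other[j])
                          for k, other in enumerate(data)
                          if k != i and other[j] is not None]
            candidates.sort(key=lambda item: item[0])
            values = [v for _, v in candidates[:q]]
            intervals[(i, j)] = (min(values), max(values))
    return intervals


def fcm_objective(points, c, m=2.0, iters=30, seed=0):
    """Run plain fuzzy c-means on complete data and return its objective value."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, c)]
    n, dim = len(points), len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)).
        u = []
        for p in points:
            d = [max(math.dist(p, v), 1e-12) for v in centers]
            u.append([1.0 / sum((d[i] / d[k]) ** (2.0 / (m - 1.0)) for k in range(c))
                      for i in range(c)])
        # Center update: membership-weighted mean of the points.
        for i in range(c):
            w = [u[t][i] ** m for t in range(n)]
            centers[i] = [sum(w[t] * points[t][j] for t in range(n)) / sum(w)
                          for j in range(dim)]
    return sum(u[t][i] ** m * math.dist(points[t], centers[i]) ** 2
               for t in range(n) for i in range(c))


def ga_impute(data, c, q=2, pop_size=20, generations=30, seed=0):
    """Search the intervals for imputations that minimize the FCM objective."""
    rng = random.Random(seed)
    intervals = nearest_neighbor_intervals(data, q)
    keys = sorted(intervals)

    def fill(genes):
        filled = [list(r) for r in data]
        for (i, j), g in zip(keys, genes):
            filled[i][j] = g
        return filled

    def fitness(genes):
        return fcm_objective(fill(genes), c)  # lower is better

    pop = [[rng.uniform(*intervals[k]) for k in keys] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[:pop_size // 2]           # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            if child:
                idx = rng.randrange(len(child))
                # Mutation resamples one gene without leaving its interval.
                child[idx] = rng.uniform(*intervals[keys[idx]])
            children.append(child)
        pop = parents + children
    return fill(min(pop, key=fitness)), intervals
```

Calling `ga_impute(data, c=2)` on a list of records, with `None` marking each missing attribute, returns a completed copy of the data together with the intervals that constrained the search; the input records are left unchanged. Restricting every gene to its interval is what keeps the genetic search anchored to nearest-neighbor information rather than to the full attribute range.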
