Indexed by:Conference paper
Date of Publication:2016-01-01
Included Journals:CPCI-S, SCIE
Page Number:619-624
Key Words:feature selection; classification; effective range; R-value
Abstract:In systems biology, filtering discriminative features out of complex high-dimensional data is a crucial issue. This paper proposes a feature selection algorithm based on feature overlapping and group overlapping (FS-FOGO) to calculate feature importance. FS-FOGO weighs each feature from two aspects: an overlapping degree based on the ratio of the overlapping area to the effective range of each class, and an overlapping degree based on the proportion of heterogeneous samples among every sample's nearest neighbors. To validate FS-FOGO, it is compared with effective range based gene selection (ERGS), which calculates feature weights from the overlapping area of the effective ranges, on six public biological data sets and one serum metabolomics data set on liver disease. Naive Bayes and Support Vector Machine are used as classifiers. The experimental results show that the features ranked highest by FS-FOGO are more discriminative and achieve higher classification accuracy than those ranked highest by ERGS in most cases. In the metabolomics data, the top-ranked metabolites selected by FS-FOGO separate the different liver diseases well.
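The second aspect described in the abstract, the proportion of heterogeneous samples among each sample's nearest neighbors, can be illustrated with a minimal sketch. This is not the paper's formula, which is not given in this record: it assumes neighbors are found per feature by absolute difference, that the score is a plain average over samples, and that a lower score means a more discriminative feature. The names `neighbor_overlap` and `rank_features` and the parameter `k` are hypothetical, and the effective-range aspect is not modeled.

```python
import numpy as np


def neighbor_overlap(values, labels, k=3):
    """Mean fraction of each sample's k nearest neighbors (on one feature)
    that carry a different class label. Assumed scoring, not the paper's."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # Pairwise absolute distances on this single feature.
    dist = np.abs(values[:, None] - values[None, :])
    # A sample must not count as its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Indices of the k closest other samples for every sample.
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # True where a neighbor belongs to a different class.
    heterogeneous = labels[nearest] != labels[:, None]
    return float(heterogeneous.mean())


def rank_features(X, y, k=3):
    """Return feature indices ordered from lowest to highest overlap."""
    X = np.asarray(X, dtype=float)
    scores = np.array([neighbor_overlap(X[:, j], y, k) for j in range(X.shape[1])])
    return np.argsort(scores, kind="stable"), scores


# Toy data: feature 0 separates the two classes, feature 1 interleaves them.
y = np.array([0, 0, 0, 1, 1, 1])
X = np.column_stack([
    [0.0, 1.0, 2.0, 10.0, 11.0, 12.0],   # well separated
    [0.0, 2.0, 4.0, 1.0, 3.0, 5.0],      # classes alternate along the axis
])
order, scores = rank_features(X, y, k=2)
```

On this toy data the separable feature gets an overlap of zero and is ranked first, while the interleaved feature scores much higher. In the setting the abstract describes, such a score would be combined with the effective-range overlap before ranking; how the two are combined is not stated here.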