Gene selection using rough set based on neighborhood for the analysis of plant stress response

Indexed by:Journal article

Date of Publication:2014-12-01

Journal:APPLIED SOFT COMPUTING

Included Journals:SCIE, EI, Scopus

Volume:25

Issue:1

Page Number:51-63

ISSN No.:1568-4946

Key Words:Gene selection; Rough set based on neighborhood; Threshold optimization; Plant stress response

Abstract:Gene selection and sample classification based on gene expression data are important research directions in bioinformatics. Because gene expression data are high-dimensional with small sample sizes, it is very difficult to select significant genes closely related to classification. Rough set based on neighborhood has been successfully applied to gene selection, as it selects attributes without redundancy and handles numerical attributes directly. Construction of neighborhoods, approximation operators, and the attribute reduction algorithm are the three key components of this gene selection approach. In this study, a novel neighborhood for numerical data, named intersection neighborhood, was defined. The performance of two kinds of approximation operators was compared on gene expression data. A significant gene selection algorithm using positive region and gene ranking was proposed and applied to the analysis of plant stress response, and it was then extended with threshold optimization for intersection neighborhood. The performance of the proposed algorithm was analyzed and compared with other related methods, classical algorithms, and rough set methods. Experiments on four data sets showed that intersection neighborhood adapted more flexibly to data with various structures, and that the approximation operator based on elementary set was more suitable for this application than the one based on element. The proposed algorithms were therefore effective: they could select significant gene subsets without redundancy and achieve high classification accuracy. (C) 2014 Elsevier B.V. All rights reserved.
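
The abstract names the positive region and attribute reduction as building blocks but does not define them here. As a rough illustration only, the sketch below shows a generic neighborhood rough set with a fixed-radius (Chebyshev distance) neighborhood and a greedy forward reduction. It is not the paper's intersection neighborhood, gene ranking, or threshold optimization; the function names, the `delta` radius, and the toy data are all assumptions made for this example.

```python
import numpy as np

def neighborhoods(X, delta):
    """N[i, j] is True when sample j lies within Chebyshev distance
    delta of sample i on the given attribute columns (assumed metric)."""
    diff = np.abs(X[:, None, :] - X[None, :, :])  # shape (n, n, d)
    return diff.max(axis=2) <= delta

def positive_region(X, y, delta):
    """Indices of samples whose whole neighborhood shares their label."""
    N = neighborhoods(X, delta)
    same_label = y[:, None] == y[None, :]
    # A sample is consistent if every neighbor has the same label.
    consistent = np.all(~N | same_label, axis=1)
    return np.flatnonzero(consistent)

def greedy_reduct(X, y, delta):
    """Forward selection: add the attribute that enlarges the positive
    region most; stop as soon as no attribute enlarges it further."""
    n_attrs = X.shape[1]
    selected, best = [], 0
    while len(selected) < n_attrs:
        candidates = []
        for a in range(n_attrs):
            if a in selected:
                continue
            size = len(positive_region(X[:, selected + [a]], y, delta))
            candidates.append((size, a))
        # Prefer the larger positive region, then the smaller index.
        size, a = max(candidates, key=lambda t: (t[0], -t[1]))
        if size <= best:
            break
        selected.append(a)
        best = size
    return selected

# Hypothetical toy data: column 0 separates the classes, column 1 does not.
X = np.array([[0.0, 0.50], [0.1, 0.10], [0.9, 0.45], [1.0, 0.15]])
y = np.array([0, 0, 1, 1])
print(greedy_reduct(X, y, delta=0.2))
```

On the toy data, column 0 alone already places every sample in the positive region, so the greedy loop stops after selecting it and the redundant column 1 is left out. In practice the radius `delta` would have to be tuned per data set, which is the kind of threshold choice the abstract's threshold optimization addresses.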
