Meng Jun (孟军)

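Personal Information

Professor

Doctoral Supervisor

Master's Supervisor

Gender: Female

Alma mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Computer Science and Technology

Disciplines: Computer Application Technology; Computer Software and Theory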

Publications

Gene selection using rough set based on neighborhood for the analysis of plant stress response

Paper type: Journal article

Publication date: 2014-12-01

Journal: APPLIED SOFT COMPUTING

Indexed in: SCIE, EI, Scopus

Volume: 25

Issue: 1

Pages: 51-63

ISSN: 1568-4946

Keywords: Gene selection; Rough set based on neighborhood; Threshold optimization; Plant stress response

Abstract: Gene selection and sample classification based on gene expression data are important research trends in bioinformatics. It is very difficult to select significant genes closely related to classification because of the high dimension and small sample size of gene expression data. Rough set based on neighborhood has been successfully applied to gene selection, as it selects attributes without redundancy and deals with numerical attributes directly. Construction of neighborhoods, approximation operators and attribute reduction algorithm are three key components in this gene selection approach. In this study, a novel neighborhood named intersection neighborhood for numerical data was defined. The performances of two kinds of approximation operators were compared on gene expression data. A significant gene selection algorithm, which was applied to the analysis of plant stress response, was proposed by using positive region and gene ranking, and then this algorithm with thresholds optimization for intersection neighborhood was extended. The performance of the proposed algorithm, along with a comparison with other related methods, classical algorithms and rough set methods, was analyzed. The results of experiments on four data sets showed that intersection neighborhood was more flexible to adapt to the data with various structure, and approximation operator based on elementary set was more suitable for this application than that based on element. That was to say that the proposed algorithms were effective, as they could select significant gene subsets without redundancy and achieve high classification accuracy. (C) 2014 Elsevier B.V. All rights reserved.
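
Illustrative note: the abstract names three components (neighborhood construction, approximation operators, attribute reduction) but does not define the intersection neighborhood or the threshold optimization. The sketch below is therefore not the authors' algorithm. It assumes a plain distance-threshold neighborhood (radius `delta`, Euclidean distance) and a simple greedy forward search on the size of the positive region, which is one common way such methods are described. The function names `positive_region` and `greedy_select` and the parameter `delta` are hypothetical.

```python
import numpy as np

def positive_region(X, y, features, delta):
    """Return indices of samples whose delta-neighborhood is label-pure.

    Assumption: the neighborhood of a sample is every sample within
    Euclidean distance `delta`, measured on the chosen features only.
    """
    sub = X[:, features]
    # Pairwise distances between all samples on the chosen features.
    dist = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=2)
    pos = []
    for i in range(len(y)):
        neighbors = dist[i] <= delta
        # A sample is in the positive region if all neighbors share its label.
        if np.all(y[neighbors] == y[i]):
            pos.append(i)
    return pos

def greedy_select(X, y, delta):
    """Forward selection: repeatedly add the feature (gene) that most
    enlarges the positive region; stop when no feature improves it."""
    selected, best = [], 0
    remaining = list(range(X.shape[1]))
    while remaining:
        sizes = [len(positive_region(X, y, selected + [f], delta))
                 for f in remaining]
        k = int(np.argmax(sizes))
        if sizes[k] <= best:
            break
        best = sizes[k]
        selected.append(remaining.pop(k))
    # Second value: fraction of samples in the positive region (dependency).
    return selected, best / len(y)
```

On a made-up toy matrix where column 0 separates the two classes and column 1 does not, `greedy_select` would keep only column 0. Real gene expression data would normally be scaled first so that a single `delta` is meaningful across genes.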