Meng Jun (孟军)

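Personal Information

Professor

Doctoral supervisor

Master's supervisor

Gender: Female

Alma mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Computer Science and Technology

Disciplines: Computer Application Technology; Computer Software and Theory

Office: A0816, Innovation Park Building (创新园大厦)

Email: mengjun@dlut.edu.cn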

Publications

Prediction of LncRNA by Using Multiple Feature Information Fusion and Feature Selection Technique

Paper type: Conference paper

Publication date: 2018-01-01

Indexed in: CPCI-S

Volume: 10955

Pages: 318-329

Keywords: Ensemble feature selection; Maximum correlation minimum redundancy; Pseudo nucleotides features; Classification; LncRNA

Abstract: Recent genomic studies suggest that long non-coding RNAs (lncRNAs) play an important role in regulating plant growth. It is therefore important to identify more plant lncRNAs and predict their functions. This paper presents an improved maximum correlation minimum redundancy method for lncRNA recognition. Sequence features, secondary structure features, and functional features are extracted, including pseudo-nucleotide features based on the physical and chemical properties of the dinucleotides of the related RNA. The maximum correlation minimum redundancy method is then used to integrate several feature selection methods: the Pearson correlation coefficient, information gain, the Relief algorithm, and random forest. An SVM classification model is built on the selected feature subset. Experimental results on an Arabidopsis sequence dataset show that the pseudo-nucleotide features reflect information about different RNA sequences, and that the classification model built with the proposed method can be more accurate than other methods at identifying plant lncRNAs.
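The feature selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes only two scorers (absolute Pearson correlation and information gain after a median split) instead of the four named in the abstract, averages their min-max normalised scores as the relevance term, and uses mean absolute Pearson correlation with the already selected features as the redundancy term. The function names (`ensemble_relevance`, `select_features`) and the toy data are hypothetical.

```python
import math
from statistics import mean, median


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def info_gain(x, y):
    """Information gain of labels y after splitting feature x at its median (an assumed discretisation)."""
    cut = median(x)
    low = [label for v, label in zip(x, y) if v <= cut]
    high = [label for v, label in zip(x, y) if v > cut]
    n = len(y)
    return entropy(y) - len(low) / n * entropy(low) - len(high) / n * entropy(high)


def min_max(scores):
    """Rescale scores to [0, 1] so that different scorers are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def ensemble_relevance(columns, y):
    """Relevance of each feature: average of the normalised scores of the two assumed scorers."""
    by_pearson = min_max([abs(pearson(c, y)) for c in columns])
    by_gain = min_max([info_gain(c, y) for c in columns])
    return [(p + g) / 2 for p, g in zip(by_pearson, by_gain)]


def select_features(columns, y, k):
    """Greedily pick k feature indices, maximising relevance minus redundancy."""
    relevance = ensemble_relevance(columns, y)
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = mean(abs(pearson(columns[j], columns[i])) for i in selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): feature 1 is a rescaled copy of feature 0,
# feature 2 is equally relevant but uncorrelated with feature 0, feature 3 is noise.
y = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 1, 0, 1, 1, 1],
    [0, 0, 0, 2, 0, 2, 2, 2],
    [0, 0, 1, 0, 1, 0, 1, 1],
    [1, 0, 0, 1, 1, 0, 0, 1],
]
selected = select_features(columns, y, 2)
print(selected)  # [0, 2]
```

On this toy data the redundant copy (feature 1) is skipped in favour of the uncorrelated feature 2, which is the behaviour the redundancy term is meant to produce. In a pipeline like the one the abstract describes, the selected columns would then be passed to a classifier such as an SVM.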