
    顾宏 (Gu Hong)

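    • Professor     Doctoral Supervisor   Master's Supervisor
    • Gender: Male
    • Alma mater: Zhejiang University
    • Degree: Doctorate
    • Affiliation: School of Control Science and Engineering
    • Discipline: Pattern Recognition and Intelligent Systems
    • Office: Innovation Park Building (创新园大厦), B0715
    • Email: guhong@dlut.edu.cn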

    Discriminative Motif Discovery via Simulated Evolution and Random Under-Sampling

    Paper type: Journal article

    Publication date: 2014-02-13

    Journal: PLOS ONE

    Indexed in: SCIE, PubMed, Scopus

    Volume: 9

    Issue: 2

    Pages: e87670

    ISSN: 1932-6203

    Abstract: Conserved motifs in biological sequences are closely related to their structure and function. Discriminative motif discovery methods have recently attracted growing attention. However, little attention has been devoted to the data imbalance problem, which is one of the main factors degrading the performance of discriminative models. In this article, a simulated evolution method is applied to address the multi-class imbalance problem at the data preprocessing stage, and a random under-sampling method is introduced at the Hidden Markov Model (HMM) training stage to handle the imbalance between the positive and negative datasets. In the task of discovering targeting motifs for nine subcellular compartments, the motifs found by our method are more conserved than those found by methods that do not consider the data imbalance problem, and they recover the most known targeting motifs from Minimotif Miner and InterPro. We also use the discovered motifs to predict protein subcellular localization and achieve higher precision and recall for the minority classes.
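The random under-sampling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact procedure (for example the target ratio, or whether sampling is repeated during HMM training), so the code below shows only the generic technique: the larger of two sets is randomly reduced, without replacement, to the size of the smaller one. The function name `random_under_sample`, its parameters, and the toy sequences are hypothetical and chosen for illustration.

```python
import random


def random_under_sample(positives, negatives, seed=None):
    """Balance two datasets by randomly shrinking the larger one.

    Generic random under-sampling, not the paper's exact procedure:
    the larger list is sampled without replacement down to the size
    of the smaller list; the smaller list is returned unchanged.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    target = min(len(positives), len(negatives))

    def shrink(items):
        # Sample only when the list is larger than the target size.
        if len(items) > target:
            return rng.sample(items, target)
        return list(items)

    return shrink(positives), shrink(negatives)


# Hypothetical toy data: 2 positive sequences versus 10 negative ones.
positive_set = ["MKTLLA", "MALWTR"]
negative_set = [f"SEQ{i}" for i in range(10)]

balanced_pos, balanced_neg = random_under_sample(positive_set, negative_set, seed=0)
print(len(balanced_pos), len(balanced_neg))
```

Under-sampling discards majority-class examples rather than duplicating minority-class ones, so the training set stays small; the trade-off is that some information in the discarded examples is lost, which is why it is often repeated with different random draws.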