顾宏
Paper type: Journal article
Publication date: 2012-11-01
Journal: PROTEIN AND PEPTIDE LETTERS
Indexed in: SCIE, PubMed, Scopus
Volume: 19
Issue: 11
Pages: 1163-1169
ISSN: 0929-8665
Keywords: Class-imbalance; K-nearest neighbor; multi-label learning; pseudo amino acid composition; subcellular localization
Abstract: Machine learning is a reliable technology for automated subcellular localization of viral proteins within a host cell or virus-infected cell. One challenge is that viral protein samples not only have multiple location sites but are also class-imbalanced, and an imbalanced dataset often degrades prediction performance. To address this challenge, this paper proposes a novel approach, named imbalance-weighted multi-label K-nearest neighbor, to predict viral protein subcellular locations with multiple sites. Jackknife test results indicate that the presented algorithm performs better than existing methods and has great potential in protein science.
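The abstract does not say how the imbalance weighting is computed, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes an inverse-frequency weight per location label and a simple weighted vote among the k nearest neighbors. The function names, the weighting formula, the threshold, and the toy data are all hypothetical.

```python
from collections import Counter
import math


def label_weights(label_sets):
    """Assumed weighting: inverse label frequency, so rarer labels weigh more."""
    counts = Counter(label for labels in label_sets for label in labels)
    n_samples = len(label_sets)
    n_labels = len(counts)
    return {label: n_samples / (n_labels * c) for label, c in counts.items()}


def predict(train_x, train_labels, query, k=3, threshold=0.4):
    """Return the set of labels whose weighted neighbor vote reaches the threshold."""
    weights = label_weights(train_labels)
    # Indices of the k training samples closest to the query (Euclidean distance).
    nearest = sorted(
        range(len(train_x)), key=lambda i: math.dist(train_x[i], query)
    )[:k]
    # Count how many of those neighbors carry each label.
    votes = Counter(label for i in nearest for label in train_labels[i])
    # Scale each label's vote fraction by its imbalance weight.
    scores = {label: weights[label] * votes[label] / k for label in weights}
    return {label for label, score in scores.items() if score >= threshold}


# Toy data: 2-D feature vectors standing in for real protein descriptors.
train_x = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6)]
train_labels = [
    {"nucleus"},
    {"nucleus"},
    {"nucleus"},
    {"nucleus", "membrane"},
    {"membrane"},
    {"membrane", "cytoplasm"},
]

near_rare = predict(train_x, train_labels, (5, 5.4))
near_common = predict(train_x, train_labels, (0.2, 0.2))
```

In this toy set "cytoplasm" appears only once, so an unweighted vote (one neighbor in three) would fall below the threshold; the inverse-frequency weight is what lets the rare label be predicted for the first query. A real evaluation would use sequence-derived features such as pseudo amino acid composition and a jackknife (leave-one-out) test, neither of which is shown here.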