Release Time: 2019-03-09
Indexed by: Journal Article
Date of Publication: 2012-11-01
Journal: PROTEIN AND PEPTIDE LETTERS
Included Journals: Scopus, PubMed, SCIE
Volume: 19
Issue: 11
Page Number: 1163-1169
ISSN: 0929-8665
Key Words: Class-imbalance; K-nearest neighbor; multi-label learning; pseudo amino acid composition; subcellular localization
Abstract: Machine learning is a reliable technique for the automated subcellular localization of viral proteins within a host cell or virus-infected cell. One challenge is that viral protein samples not only have multiple location sites but are also class-imbalanced, and an imbalanced dataset often degrades prediction performance. To address this challenge, this paper proposes a novel approach, named imbalance-weighted multi-label K-nearest neighbor, to predict the subcellular locations of viral proteins with multiple sites. Experimental results from the jackknife test indicate that the presented algorithm performs better than existing methods and has great potential in protein science.