顾宏
Paper type: Journal article
Publication date: 2016-08-15
Journal: ANALYTICAL BIOCHEMISTRY
Indexed in: SCIE, PubMed, Scopus
Volume: 507
Pages: 1-6
ISSN: 0003-2697
Keywords: Post-translational modification; Pupylation; Semi-supervised learning; Support vector machine; k-spaced amino acid pair
Abstract: As an important post-translational modification of prokaryotic proteins, pupylation plays a key role in regulating various biological processes. Accurate identification of pupylation sites is crucial for understanding the underlying mechanisms of pupylation. Although several computational methods have been developed for identifying pupylation sites, their prediction accuracy remains unsatisfactory. Here, a novel bioinformatics tool named IMP-PUP is proposed to improve the prediction of pupylation sites. IMP-PUP is built on the composition of k-spaced amino acid pairs and trained with a modified semi-supervised self-training support vector machine (SVM) algorithm. The proposed algorithm iteratively trains a series of SVM classifiers on both annotated and non-annotated pupylated proteins. Computational results show that IMP-PUP achieves areas under the receiver operating characteristic curve (AUC) of 0.91, 0.73, and 0.75 on our training set, Tung's testing set, and our testing set, respectively, which are better than those of the different error costs SVM algorithm and the original self-training SVM algorithm. Independent tests also show that IMP-PUP significantly outperforms three other existing pupylation site predictors: GPS-PUP, iPUP, and pbPUP. Therefore, IMP-PUP can be a useful tool for accurate prediction of pupylation sites. A MATLAB software package for IMP-PUP is available at https://juzhe1120.githubio/. (C) 2016 Elsevier Inc. All rights reserved.
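
The feature encoding named in the abstract, the composition of k-spaced amino acid pairs, can be illustrated with a minimal Python sketch. This is not the IMP-PUP MATLAB implementation: the function name `cksaap`, the default `k_max=2`, and the example peptide are assumptions chosen for illustration, and the abstract does not state the window length or the range of k that the authors used. For each gap k, the sketch counts every residue pair separated by k positions and divides by the number of such pairs in the window.

```python
from itertools import product

# The 20 standard amino acids; any other symbol (e.g. a padding character)
# is ignored when counting pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for one peptide window.

    Returns 400 * (k_max + 1) values: for each gap k = 0..k_max, the
    frequency of every ordered pair (x, y) where y follows x with exactly
    k residues in between. k_max is an illustrative assumption.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(window) - k - 1  # number of k-spaced pairs in the window
        for i in range(max(n_positions, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        denom = n_positions if n_positions > 0 else 1  # avoid division by zero
        features.extend(counts[p] / denom for p in PAIRS)
    return features


# Hypothetical toy peptide, not taken from any data set in the paper.
vec = cksaap("AKAKA", k_max=1)
```

In a pipeline of the kind the abstract describes, vectors like `vec` would be computed for a sequence window around each candidate lysine and passed to the SVM classifier; the self-training loop itself is not sketched here.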