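Professor, Doctoral Supervisor, Master's Supervisor
Gender: Male
Alma mater: Northeast Normal University
Degree: Doctorate
Affiliation: School of Bioengineering
Disciplines: Biochemical Engineering; Biochemistry and Molecular Biology; Bioengineering
Office: Room 401, School of Bioengineering
Contact: 13624087256
Email: luanyush@dlut.edu.cn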
Paper type: Conference paper
Publication date: 2013-12-13
Indexed in: EI, CPCI-S, Scopus
Page range: 342-346
Keywords: Protein prediction; Protein granularity; Feature extraction
Abstract: Assigning biological function to uncharacterized proteins is a fundamental problem in the post-genomic age. The growing volume of protein sequence data has driven the development of computational methods that predict protein function quickly and accurately. In this work, we extract 353 numerical features from sequences, based not only on physicochemical properties but also on protein granularity. Principal Component Analysis (PCA), an exploratory data analysis tool, is applied to obtain an optimized feature set by excluding poorly performing or redundant features, leaving 204 features. The optimized 204-feature subset is then used to predict protein function with the k-nearest neighbors (KNN) algorithm. The prediction model achieves an overall prediction accuracy of 84.6%. The experimental results show that our approach predicts the functional class of unknown proteins efficiently.
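The pipeline described in the abstract (sequence features, then PCA, then KNN) can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 20 amino acid composition fractions stand in for the paper's 353 features, the PCA step projects onto principal components (found by power iteration) rather than reproducing the paper's selection of 204 features, and the toy sequences, class labels, and function names are hypothetical.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Hypothetical toy data: (sequence, functional class). Not from the paper.
TOY_TRAINING = [
    ("AAAAGAAAVA", "class_A"), ("AAGAAAAALA", "class_A"), ("AAAAAVAAGA", "class_A"),
    ("KKKKRKKKEK", "class_B"), ("KKRKKKKDKK", "class_B"), ("KRKKKKEKKK", "class_B"),
]


def composition_features(sequence):
    """Stand-in feature extractor: fraction of each standard amino acid."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def fit_pca(rows, n_components, iterations=200, seed=0):
    """Return (means, components) via power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    # Sample covariance matrix of the centered features (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        # Random start: a constant vector would be orthogonal to the data here,
        # because composition fractions always sum to 1.
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # no variance left to explain
                break
            v = [x / norm for x in w]
        cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        eigenvalue = sum(a * b for a, b in zip(v, cov_v))
        components.append(v)
        # Deflate so the next pass finds the next-largest direction.
        for i in range(d):
            for j in range(d):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return means, components


def project(row, means, components):
    """Coordinates of one feature vector in the principal-component basis."""
    centered = [x - m for x, m in zip(row, means)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def train_and_predict(training, query_sequence, n_components=2, k=3):
    """Run the full sketch: features -> PCA projection -> KNN vote."""
    features = [composition_features(seq) for seq, _ in training]
    labels = [label for _, label in training]
    means, components = fit_pca(features, n_components)
    reduced = [project(row, means, components) for row in features]
    query = project(composition_features(query_sequence), means, components)
    return knn_predict(reduced, labels, query, k)
```

Calling `train_and_predict(TOY_TRAINING, "AAAGAAAAAA")` classifies an alanine-rich query against the toy training set. With real data, a library implementation of PCA and KNN and a held-out evaluation set would be the more sensible choice; the hand-written versions here only keep the sketch dependency-free.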