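Personal Information
Professor
Doctoral supervisor
Master's supervisor
Gender: Female
Alma mater: Dalian University of Technology
Degree: Ph.D.
Affiliation: School of Computer Science and Technology
Discipline: Computer Application Technology; Computer Software and Theory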
Parallel Information Fusion Method for Microarray Data Analysis
Paper type: Conference paper
Publication date: 2015-10-29
Indexed in: EI, CPCI-S, SCIE, Scopus
Pages: 1539-1544
Keywords: Microarray Data; MapReduce Programming Model; Parallel Information Fusion
Abstract: Classification of microarray data has always been a challenging task because of the enormous number of genes. Finding a small set of closely related genes that accurately classifies disease cells is an important research problem. Integrating biological knowledge into genomic analysis is an effective way to improve the interpretation of the results. In this paper, the affinity propagation (AP) clustering algorithm is chosen to analyze the impact of biological similarity on the results. We integrate GO semantic similarity into AP clustering for granule construction. Using the MapReduce programming model, a parallel information fusion method is proposed. Similarity matrix construction and message passing in the AP algorithm are parallelized with MapReduce. A parallel randomly directed hill climb ensemble pruning (RandomDHCEP) method based on MapReduce is introduced for ensemble pruning. A worked example illustrates the affinity propagation and ensemble pruning steps as iterative MapReduce programs. The proposed method scales well on large data as the number of nodes increases, and it achieves higher classification accuracy than using the whole gene set for classification.
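To make the message-passing step mentioned in the abstract concrete, here is a minimal single-machine sketch of standard affinity propagation in NumPy. It is not the paper's MapReduce implementation: the function names, the damping value, and the toy similarity (negative squared distance between 1-D points) are assumptions for illustration, and the similarity matrix `S` is assumed to have been built already (in the paper it would also incorporate GO semantic similarity, whose fusion formula is not given in this record).

```python
import numpy as np


def similarity_from_points(x, preference):
    """Hypothetical helper: negative squared distance between 1-D points.

    The diagonal holds the 'preference', which controls how readily a
    point becomes an exemplar.
    """
    x = np.asarray(x, dtype=float)
    S = -(x[:, None] - x[None, :]) ** 2
    np.fill_diagonal(S, preference)
    return S


def affinity_propagation(S, damping=0.5, max_iter=200):
    """Standard AP updates on a dense similarity matrix S (n x n)."""
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)

    for _ in range(max_iter):
        # Responsibility: r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')].
        # Each row i only needs row i of A and S.
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        # Availability: a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # and a(k,k) = sum_{i' != k} max(0, r(i',k)).
        # Each column k only needs column k of R.
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new

    # Points with a(k,k) + r(k,k) > 0 are taken as exemplars.
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels
```

Because the responsibility update reads one row at a time and the availability update reads one column at a time, one plausible way to distribute the work is to key map outputs by row for the first step and by column for the second, with one job pair per iteration. How the paper actually lays out its jobs is not stated in this record.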