Parallel Information Fusion Method for Microarray Data Analysis

Indexed by:Conference paper

Date of Publication:2015-10-29

Included Journals:EI, CPCI-S, SCIE, Scopus

Page Number:1539-1544

Key Words:Microarray Data; MapReduce Programming Model; Parallel Information Fusion

Abstract:Classification of microarray data has always been a challenging task because of the enormous number of genes. Finding a small, closely related gene set that accurately classifies disease cells is an important research problem. Integrating biological knowledge into genomic analysis to help improve the interpretation of the results is an effective approach. In this paper, the affinity propagation (AP) clustering algorithm is chosen to analyze the impact of biological similarity on the results. We integrate Gene Ontology (GO) semantic similarity into AP clustering for granule construction. Using the MapReduce programming model, a parallel information fusion method is proposed. The construction of the similarity matrix and the message passing in the AP algorithm are parallelized using MapReduce. A parallel randomly directed hill climb ensemble pruning (RandomDHCEP) method based on MapReduce is introduced for ensemble pruning. An instance analysis illustrates the process of affinity propagation and ensemble pruning using an iterative MapReduce program. The proposed method scales well on large data as the number of nodes increases, and it provides higher classification accuracy than using the whole gene set for classification.
