Protein-protein Interaction extraction based on ensemble kernel model and active learning strategy

Indexed by:Conference paper

Date of Publication:2011-11-27

Included Journals:EI, Scopus

Page Number:9-14

Abstract:Protein-Protein Interaction (PPI) extraction from the biomedical literature can rapidly supply biomedical researchers with useful information. This paper presents a PPI extraction system based on an ensemble kernel model and active learning. First, the ensemble kernel, used within an SVM classifier, combines a lexical feature-based kernel and a path-based kernel. Experimental results show that the F-scores of PPI extraction using the ensemble kernel model on the AIMED, IEPA and BCPPI corpora are 64.50%, 69.74% and 60.38% respectively under 10-fold cross-validation, which are better than those obtained with the lexical feature-based kernel or the path-based kernel alone. Because the SVM-based ensemble kernel model needs a large amount of labeled data, and labeling data manually is expensive, we integrate active learning into the ensemble kernel model. The active learning method uses an uncertainty-based sampling strategy. With active learning integrated, the F-scores on the AIMED, IEPA and BCPPI corpora are 65.24%, 70.19% and 61.87% respectively, which are better than those of the ensemble kernel model with passive learning, while the amount of labeled data required is reduced by 20%, 30% and 30%, respectively. © 2011 IEEE.
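
The two ideas in the abstract, combining kernels and uncertainty-based sampling, can be illustrated with a minimal sketch. The abstract does not say how the two kernels are combined or how uncertainty is measured, so everything below is an assumption made for illustration: the kernels are combined as a weighted sum of precomputed Gram matrices, uncertainty is taken as the absolute SVM decision value (distance to the decision boundary), and the names `combine_kernels`, `select_most_uncertain`, `weight` and `batch_size` are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def combine_kernels(k_lexical: Matrix, k_path: Matrix, weight: float = 0.5) -> Matrix:
    """Weighted sum of two Gram matrices of the same shape.

    Assumption: a convex combination is used here because it keeps the
    result a valid kernel; the paper's actual combination is not given
    in the abstract.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(k_lexical) != len(k_path):
        raise ValueError("kernel matrices must have the same shape")
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(k_lexical, k_path)
    ]


def select_most_uncertain(decision_values: Sequence[float], batch_size: int) -> List[int]:
    """Indices of the unlabeled examples closest to the decision boundary.

    Assumption: a small absolute decision value means the classifier is
    uncertain, so those examples are sent for manual labeling first.
    """
    ranked = sorted(range(len(decision_values)), key=lambda i: abs(decision_values[i]))
    return ranked[:batch_size]


# Toy Gram matrices for two candidate protein pairs (made-up values).
k_lex = [[1.0, 0.2], [0.2, 1.0]]
k_path = [[1.0, 0.6], [0.6, 1.0]]
combined = combine_kernels(k_lex, k_path, weight=0.5)

# Toy decision values for four unlabeled examples (made-up values).
scores = [1.7, -0.1, 0.4, -2.3]
picked = select_most_uncertain(scores, batch_size=2)
```

In a full active learning loop, the selected examples would be labeled, added to the training set, and the classifier retrained on the combined kernel before the next round of selection.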
