A general protein-protein interaction extraction architecture based on word representation and feature selection

Indexed by:Journal Papers

Date of Publication:2016-01-01

Journal:INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS

Included Journals:SCIE, Scopus

Volume:14

Issue:3

Page Number:276-291

ISSN No.:1748-5673

Key Words:instance representation; word representation; protein-protein interaction; relation extraction; biomedical text mining

Abstract:Previous research has shown that supervised Protein-Protein Interaction Extraction (PPIE) can achieve high accuracy with elaborately selected features and kernels. However, most features and kernels rely on domain knowledge and natural language analysis, which makes supervised models expensive, heavy and brittle. Moreover, commonly used representation techniques, such as one-hot encoding and the Vector Space Model, fail to capture the semantic similarity between words. To reduce manual labour and take advantage of semantic representations, we propose a general instance representation architecture for PPIE that integrates word representation, vector composition and feature selection. Our method obtains F-scores of 69.7, 78.8, 72.3, 72.0 and 83.7 on AIMed, BioInfer, HPRD50, IEPA and LLL, respectively.
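The three stages named in the abstract (word representation, vector composition, feature selection) can be illustrated with a minimal sketch. It is not the paper's implementation: the toy embeddings in `TOY_VECTORS`, the averaging composition in `compose`, and the class-mean-difference score in `select_features` are all assumptions chosen only to make the idea concrete. The first part also shows why one-hot vectors cannot express similarity between distinct words, whereas dense vectors can.

```python
import math

# Hypothetical toy embeddings; a real system would load pretrained word vectors.
TOY_VECTORS = {
    "binds": [0.9, 0.1, 0.3],
    "interacts": [0.8, 0.2, 0.4],
    "with": [0.1, 0.1, 0.1],
    "cell": [0.0, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def one_hot(word, vocab):
    """One-hot vector for `word` over an ordered vocabulary."""
    return [1.0 if w == word else 0.0 for w in vocab]


def compose(tokens, vectors):
    """Average the known word vectors into one fixed-length instance vector.

    Averaging is an assumed composition; other operators are possible.
    """
    dim = len(next(iter(vectors.values())))
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


def select_features(instances, labels, k):
    """Return indices of the k dimensions whose class means differ most.

    This simple score is an assumption standing in for a real
    feature selection criterion.
    """
    dim = len(instances[0])
    pos = [x for x, y in zip(instances, labels) if y == 1]
    neg = [x for x, y in zip(instances, labels) if y == 0]

    def mean(rows, j):
        return sum(r[j] for r in rows) / len(rows) if rows else 0.0

    scores = [abs(mean(pos, j) - mean(neg, j)) for j in range(dim)]
    ranked = sorted(range(dim), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])
```

With these toy values, the one-hot vectors for "binds" and "interacts" are orthogonal, while their dense vectors are close in cosine similarity; `compose` then turns a token sequence into a fixed-length vector that `select_features` can prune before a classifier is trained.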
