Zhihao Yang (杨志豪)

Personal Information

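Professor

Doctoral Supervisor

Master's Supervisor

Gender: Male

Alma mater: Dalian University of Technology

Degree: Ph.D.

Affiliation: School of Computer Science and Technology

Email: yangzh@dlut.edu.cn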

Publications

Disease named entity recognition from biomedical literature using a novel convolutional neural network

Paper type: Journal article

Publication date: 2017-12-28

Published in: IEEE International Conference on Bioinformatics and Biomedicine (BIBM) - Medical Genomics

Indexed in: SCIE, CPCI-S, PubMed, SSCI

Volume: 10

Issue: Suppl 5

Pages: 73

ISSN: 1755-8794

Keywords: Disease; Named entity recognition; Convolutional neural network; Deep learning; Multiple label strategy

Abstract: Background: Automatic disease named entity recognition (DNER) is of utmost importance for the development of more sophisticated BioNLP tools. However, most conventional CRF-based DNER systems rely on well-designed features whose selection is labor-intensive and time-consuming. Although most deep learning methods can solve NER problems with little feature engineering, they employ an additional CRF layer to capture the correlation information between neighboring labels, which makes them considerably more complicated.
   Methods: In this paper, we propose a novel multiple label convolutional neural network (MCNN) based disease NER approach. Instead of the CRF layer, this approach employs a multiple label strategy (MLS), first introduced by us. First, the character-level embedding, word-level embedding and lexicon feature embedding are concatenated. Then several convolutional layers are stacked over the concatenated embedding. Finally, the MLS is applied to the output layer to capture the correlation information between neighboring labels.
   Results: The experimental results show that MCNN achieves state-of-the-art performance on both the NCBI and CDR corpora.
   Conclusions: The proposed MCNN-based disease NER method achieves state-of-the-art performance with little feature engineering, and the experimental results show the effectiveness of the MLS in capturing the correlation information between neighboring labels.
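
The first two steps described under Methods (concatenating the three embeddings, then stacking convolutional layers over the result) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: all dimensions, the kernel width, the number of layers, the ReLU activation and the random weights are assumptions chosen only for illustration, and each token is assumed to already have a fixed-size character-level vector. The MLS output layer is omitted because the abstract does not specify its details.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv1d_same(x, w, b):
    """1-D convolution over the token axis with zero padding, then ReLU.

    x: (seq_len, in_dim), w: (kernel, in_dim, out_dim), b: (out_dim,)
    Returns an array of shape (seq_len, out_dim). Assumes an odd kernel width.
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))  # pad the token axis only
    out = np.stack([
        np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1])) + b
        for t in range(x.shape[0])
    ])
    return np.maximum(out, 0.0)  # ReLU (an assumption, not from the abstract)


# Hypothetical sizes, chosen only for illustration.
seq_len, char_dim, word_dim, lex_dim = 6, 8, 16, 4
hidden, kernel, n_layers = 12, 3, 2

# Stand-ins for the per-token embeddings (random here, learned in practice).
char_emb = rng.normal(size=(seq_len, char_dim))
word_emb = rng.normal(size=(seq_len, word_dim))
lex_emb = rng.normal(size=(seq_len, lex_dim))

# Step 1: concatenate the three embeddings along the feature axis.
features = np.concatenate([char_emb, word_emb, lex_emb], axis=1)

# Step 2: stack several convolutional layers over the concatenated embedding.
h = features
in_dim = features.shape[1]
for _ in range(n_layers):
    w = rng.normal(scale=0.1, size=(kernel, in_dim, hidden))
    b = np.zeros(hidden)
    h = conv1d_same(h, w, b)
    in_dim = hidden

print(features.shape, h.shape)
```

Each row of `h` is a per-token representation that sees a window of neighboring tokens; in the approach described above, the output layer with MLS would operate on these representations.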