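Personal Information
Professor
Doctoral Supervisor
Master's Supervisor
Gender: Female
Alma Mater: Dalian University of Technology
Degree: Ph.D.
Affiliation: School of Computer Science and Technology
Discipline: Computer Application Technology
Office: Room B811, Innovation Park Building
Contact: 0411-84706009-2811
Email: wangjian@dlut.edu.cn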
A neural network approach to chemical and gene/protein entity recognition in patents
Paper Type: Journal Article
Publication Date: 2018-12-18
Journal: Journal of Cheminformatics
Indexed In: SCIE, PubMed, Scopus
Volume: 10
Issue: 1
Pages: 65
ISSN: 1758-2946
Keywords: Patents; Biomedical entity recognition; Deep learning; Long short-term memory; Conditional random field
Abstract: In biomedical research, patents contain a significant amount of information, and biomedical text mining of patents has recently received much attention. To accelerate the development of biomedical text mining for patents, the BioCreative V.5 challenge organized three tracks focused on biomedical entity recognition in patents: chemical entity mention recognition (CEMP), gene and protein related object recognition (GPRO), and technical interoperability and performance of annotation servers. This paper describes our neural network approach for the CEMP and GPRO tracks. In this approach, a bidirectional long short-term memory network with a conditional random field layer is employed to recognize biomedical entities in patents. To improve performance, we explored the effect of additional features (i.e., part-of-speech, chunking, and named entity recognition features generated by the GENIA tagger) on the neural network model. In the official results, our best runs achieved the highest performance among all participating teams in both tracks: a precision of 88.32%, a recall of 92.62%, and an F-score of 90.42% in the CEMP track, and a precision of 76.65%, a recall of 81.91%, and an F-score of 79.19% in the GPRO track.
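
The F-scores reported in the abstract are consistent with the harmonic mean of precision and recall (the balanced F1 measure). The following is a minimal sketch that checks this arithmetic, assuming "F-score" here means F1; the helper name `f_score` is illustrative and does not come from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Inputs and output are percentages. Assumption: the abstract's
    "F-score" is the balanced F1 measure.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both values are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall values as reported in the abstract.
cemp = f_score(88.32, 92.62)
gpro = f_score(76.65, 81.91)

print(f"CEMP F-score: {cemp:.2f}%")
print(f"GPRO F-score: {gpro:.2f}%")
```

Rounded to two decimal places, both values agree with the F-scores stated in the abstract (90.42% for CEMP and 79.19% for GPRO).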