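Personal Information
Professor
Doctoral supervisor
Master's supervisor
Gender: Female
Alma mater: Dalian University of Technology
Degree: Ph.D.
Affiliation: School of Computer Science and Technology
Disciplines: Computer Application Technology; Computer Software and Theory
Office: Innovation Building, A930
Email: lils@dlut.edu.cn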
Two-phase biomedical named entity recognition using CRFs
Paper type: Journal article
Publication date: 2009-08-01
Journal: COMPUTATIONAL BIOLOGY AND CHEMISTRY
Indexed in: SCIE, EI, PubMed, Scopus
Volume: 33
Issue: 4
Pages: 334-338
ISSN: 1476-9271
Keywords: Text mining; Biomedical named entity recognition; Named entity detection; Named entity classification; Conditional random fields
Abstract: As a fundamental step of biomedical text mining, Biomedical Named Entity Recognition (Bio-NER) remains a challenging task. This paper explores a two-phase approach to identifying biomedical entities, in which the recognition task is divided into two subtasks, Named Entity Detection (NED) and Named Entity Classification (NEC), each handled in its own phase. In the first phase, a Conditional Random Fields (CRFs) model identifies each named entity without determining its type; in the second phase, another CRFs model determines the correct entity type for each identified entity. This treatment reduces the training time significantly and allows more relevant features to be selected for each subtask. To achieve better performance, post-processing algorithms are applied before the NEC subtask. Experiments conducted on the JNLPBA2004 datasets show that the two-phase approach achieves an F-score of 74.31%, which outperforms most of the state-of-the-art systems. © 2009 Elsevier Ltd. All rights reserved.
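The split into detection and classification described in the abstract can be illustrated with the label bookkeeping that such a pipeline needs. The sketch below is not the authors' implementation and contains no CRFs model. It assumes a BIO tagging scheme with typed labels such as `B-protein`, and the function names `strip_types`, `extract_spans`, and `attach_types` are hypothetical. It shows how typed labels could be collapsed for a detection phase, how detected spans could be read off, and how per-span types from a classification phase could be merged back.

```python
from typing import List, Tuple

# Assumption: entities are annotated with typed BIO labels ("B-protein",
# "I-protein", "O"). All helper names here are illustrative only.


def strip_types(labels: List[str]) -> List[str]:
    """Collapse typed BIO labels to untyped ones (B / I / O) for detection."""
    return [label.split("-", 1)[0] for label in labels]


def extract_spans(labels: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) token spans, end exclusive, from untyped BIO labels."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, label in enumerate(labels):
        if label == "B" or (label == "I" and start is None):
            # A new entity begins; close any entity that was still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif label == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(labels)))
    return spans


def attach_types(
    length: int, spans: List[Tuple[int, int]], types: List[str]
) -> List[str]:
    """Merge one predicted type per detected span back into typed BIO labels."""
    out = ["O"] * length
    for (start, end), entity_type in zip(spans, types):
        out[start] = f"B-{entity_type}"
        for j in range(start + 1, end):
            out[j] = f"I-{entity_type}"
    return out


if __name__ == "__main__":
    typed = ["O", "B-protein", "I-protein", "O", "B-DNA"]
    untyped = strip_types(typed)          # input view for a detection model
    spans = extract_spans(untyped)        # units a classifier would label
    # In a real system the types would come from the second-phase model.
    rebuilt = attach_types(len(typed), spans, ["protein", "DNA"])
    print(untyped, spans, rebuilt)
```

In this arrangement the first model only has to choose among three labels, which is one plausible reading of why the abstract reports shorter training time, and the second model can use features computed over a whole span rather than a single token.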