Zhou Huiwei (周惠巍)

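Personal Information

Associate Professor

Doctoral Supervisor

Master's Supervisor

Gender: Female

Alma mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Computer Science and Technology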

Office: Room B911, Innovation Park Building, Dalian University of Technology

Email: zhouhuiwei@dlut.edu.cn

Publications

Co-training for detecting hedges and their scope in biomedical texts

Paper type: Journal article

Publication date: 2015-10-15

Journal: Journal of Computational Information Systems

Indexed in: EI, Scopus

Volume: 11

Issue: 20

Pages: 7387-7395

ISSN: 1553-9105

Abstract: To avoid extracting uncertain statements as factual information, detecting hedges and their scope is an important step in biomedical text mining. Current approaches learn detection models from labeled data only. Such approaches are limited by the small amount of training data and by the difference between the training data and the data the system is applied to. We propose a co-training approach that uses the limited labeled data together with unlabeled data to boost the detection of hedge cues and their scope. Experiments are carried out on the biomedical corpus of the CoNLL 2010 Shared Task and on free data derived from biomedical literature. Both the test data of the corpus and the free data are used as unlabeled data. Experimental results show that the test data helps more than the free data on both tasks. The best F-score achieved is 88.12% for hedge cue identification and 63.09% for hedge scope detection, which significantly outperforms previous systems. The co-training system can transfer the distribution of the unlabeled data to the labeled training data, effectively improving performance on the unlabeled data. Copyright © 2015 Binary Information Press.
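The abstract does not say which feature views or classifiers the paper uses, so the following is only a minimal sketch of the general co-training idea, not the authors' system. It assumes each example has two independent one-dimensional feature views and uses a toy nearest-centroid classifier per view. The function names (`centroid_fit`, `centroid_predict`, `co_train`), the labels `"hedge"` and `"certain"`, and the data are all hypothetical.

```python
from statistics import mean


def centroid_fit(xs, ys):
    """Toy classifier: store the mean feature value of each class."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}


def centroid_predict(model, x):
    """Return (label, confidence); confidence is the gap between the
    distances to the nearest and second-nearest class centroids."""
    dists = sorted((abs(x - m), c) for c, m in model.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: list of ((view_a, view_b), label) pairs.
    unlabeled: list of (view_a, view_b) tuples.
    In each round, a classifier trained on one view labels the pool
    examples it is most confident about and moves them into the shared
    training set; then the classifier for the other view does the same."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                return labeled
            xs = [x[view] for x, _ in labeled]
            ys = [y for _, y in labeled]
            model = centroid_fit(xs, ys)
            # Rank the pool by this view's confidence, most confident first.
            ranked = sorted(
                pool,
                key=lambda x: centroid_predict(model, x[view])[1],
                reverse=True,
            )
            for x in ranked[:per_round]:
                label, _ = centroid_predict(model, x[view])
                labeled.append((x, label))
                pool.remove(x)
    return labeled


# Hypothetical toy data: two labeled seeds and four unlabeled examples.
labeled = [((0.0, 0.1), "certain"), ((1.0, 0.9), "hedge")]
unlabeled = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.3), (0.8, 0.7)]
result = dict(co_train(labeled, unlabeled))
```

In a realistic setting the pool would hold unlabeled sentences (for example test data or free text, as in the abstract), and the per-view models would be proper sequence labelers rather than centroid rules.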