Paper type: Journal article
Published in: INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL
Indexed in: SCIE
Volume: 14
Issue: 6
Pages: 1969-1981
ISSN: 1343-4500
Keywords: Text Mining; Gene Mention Tagging; Named Entity Recognition
Abstract: Gene mention tagging is one of the basic tasks in automatic information extraction from biomedical texts. It remains challenging because gene naming is irregular and new genes appear frequently. In this paper, six divergent models are implemented with different machine learning algorithms and dissimilar feature sets. The recognition results from the six models are then combined using simple set operations (union and intersection) and a voting method to further improve tagging performance. Experiments conducted on the BioCreative II GM task corpus show that our best-performing integration model achieves an F-score of 88.10%, which outperforms most state-of-the-art systems.
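The combination step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how mentions are represented or what voting threshold is used, so the sketch assumes each model outputs a set of exact-match spans `(sentence_id, start, end)`, and the function names and the `min_votes` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

# Assumed representation (not from the paper): a gene mention is an
# exact span identified by (sentence_id, start_offset, end_offset).
Mention = Tuple[str, int, int]


def combine_union(predictions: List[Set[Mention]]) -> Set[Mention]:
    """Keep every mention proposed by at least one model (favors recall)."""
    return set().union(*predictions)


def combine_intersection(predictions: List[Set[Mention]]) -> Set[Mention]:
    """Keep only mentions proposed by all models (favors precision)."""
    if not predictions:
        return set()
    return set.intersection(*predictions)


def combine_vote(predictions: List[Set[Mention]], min_votes: int) -> Set[Mention]:
    """Keep mentions proposed by at least `min_votes` models.

    `min_votes` is a hypothetical parameter; the abstract does not
    state the threshold the authors used.
    """
    counts = Counter(m for preds in predictions for m in preds)
    return {m for m, c in counts.items() if c >= min_votes}


# Toy outputs from three hypothetical models (the paper uses six).
model_a = {("s1", 0, 4), ("s1", 10, 15)}
model_b = {("s1", 0, 4)}
model_c = {("s1", 0, 4), ("s1", 10, 15), ("s2", 3, 8)}
outputs = [model_a, model_b, model_c]

union_result = combine_union(outputs)
intersection_result = combine_intersection(outputs)
vote_result = combine_vote(outputs, min_votes=2)
```

In this toy setting, union and intersection are the two extremes of voting: `min_votes=1` reproduces the union, and `min_votes=len(outputs)` reproduces the intersection, so the threshold trades recall against precision.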
