Indexed by:Journal article
Date of Publication:2009-12-01
Journal:INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL
Included Journals:SCIE, EI, Scopus
Volume:5
Issue:12A
Page Number:4523-4530
ISSN No.:1349-4198
Key Words:Chinese morphological analysis; MMSM model; CRF; Hidden semi-CRF
Abstract:In this paper, we describe a scheme for Chinese word segmentation and POS tagging which integrates character-based and word-based information in the directed graph generated by the MMSM model. Word-level information is effective for the analysis of known words, while character-level information is useful for the analysis of unknown words. A Hidden semi-CRF model is proposed for unknown word detection and POS tagging. The proposed Hidden semi-CRF has two state chains with unequal states, which can perform segmentation and POS tagging of unknown words simultaneously. The hybrid model was evaluated using the test data from SIGHAN-6 and achieved a higher F-score than the state-of-the-art models.