Release Time: 2019-03-09
Indexed by: Journal Article
Date of Publication: 2009-12-01
Journal: INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL
Included Journals: Scopus, EI, SCIE
Volume: 5
Issue: 12A
Page Number: 4523-4530
ISSN: 1349-4198
Key Words: Chinese morphological analysis; MMSM model; CRF; Hidden semi-CRF
Abstract: In this paper, we describe a scheme for Chinese word segmentation and POS tagging that integrates character-based and word-based information in the directed graph generated by the MMSM model. Word-level information is effective for the analysis of known words, while character-level information is useful for the analysis of unknown words. A Hidden semi-CRF model is proposed for unknown word detection and POS tagging. The proposed Hidden semi-CRF has two state chains with unequal states, which can perform segmentation and POS tagging of unknown words simultaneously. The hybrid model was evaluated on the test data from SIGHAN-6 and achieved a higher F-score than state-of-the-art models.