MT-Oriented English PoS Tagging and Its Application to Noun Phrase Chunking

Release Time: 2019-03-09

Indexed by: Journal Article

Date of Publication: 2012-03-01

Journal: CHINA COMMUNICATIONS

Included Journals: Scopus, SCIE

Volume: 9

Issue: 3

Page Number: 58-67

ISSN: 1673-5447

Key Words: English PoS tagging; maximum entropy; rule-based approach; machine translation; NP chunking

Abstract: A hybrid approach to English Part-of-Speech (PoS) tagging is presented, with English-Chinese machine translation in the business domain as its target application. It demonstrates how an existing tagger can be adapted to learn from a small amount of data and to handle unknown words for the purpose of machine translation. A small (998 k) annotated English corpus in the business domain is built semi-automatically, based on a new tagset; a maximum entropy model is adopted, and a rule-based approach is used in post-processing. The tagger is further applied to Noun Phrase (NP) chunking. Experiments show that the tagger achieves an accuracy of 98.14%. In the application to NP chunking, the tagger yields a 2.21% increase in F-score compared with the results obtained using the Stanford tagger.
