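Ma Jianjun (马建军)

Personal Information

Professor

Master's Supervisor

Gender: Female

Alma Mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Foreign Languages

Discipline: Foreign Linguistics and Applied Linguistics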

Office: Room 107, Liberal Arts Building (文科楼)

Contact: majian@dlut.edu.cn

Email: majian@dlut.edu.cn

Publications

Identification of English prepositional phrases within business domain for machine translation

Paper Type: Journal Article

Publication Date: 2013-10-10

Journal: Journal of Information and Computational Science

Indexed In: EI, Scopus

Volume: 10

Issue: 15

Pages: 4849-4860

ISSN: 1548-7741

Abstract: An MT-oriented system using Conditional Random Fields (CRFs) is presented to identify English Prepositional Phrases (PPs) within business domain. For the purpose of English-Chinese Machine Translation (MT), we, under the guidance of the theory of Syntactic Functional Grammar (SFG), refine PP function chunks into four types instead of the binary attachment. In order to improve the identification of these chunk types, we revise the Penn Treebank tagset with four major changes being made. A small size of 998k English annotated corpus in business domain is semi-automatically built based on our new tagset employing the Maximum Entropy model. Experiments show that our system achieves an accuracy of 88.45%, higher than other reported approaches. The adjustments made in the PP chunk types and POS tagset give rise to 4.11%, 4.25% and 4.15% increase in the precision, recall and F-score respectively. © 2013 Binary Information Press.