Zhao Liang (赵亮)

Personal Information

Professor

Doctoral Supervisor

Master's Supervisor

Gender: Male

Alma mater: Dalian University of Technology

Degree: Ph.D.

Affiliation: School of Control Science and Engineering

Disciplines: Control Theory and Control Engineering; Pattern Recognition and Intelligent Systems

Office: Innovation Park Building, Room A711

Contact: 1388-9695-114

Email: zliang@dlut.edu.cn

Publications

Mining large-scale comparable corpora from Chinese-English news collections

Paper type: Conference paper

Publication date: 2010-08-23

Indexed in: EI, Scopus

Volume: 2

Pages: 472-480

Abstract: In this paper, we explore a CLIR-based approach to constructing large-scale Chinese-English comparable corpora, which are valuable for translation knowledge mining. The initial source and target document sets are crawled from news websites and normalized to a uniform format. Keywords are first extracted from each source document; the extracted keywords are then translated and combined into query words according to certain criteria, and the query is run against an index built from the target document set. The mapping between source and target documents is then established according to the similarity score computed by the retrieval tool. Two methods for filtering the comparable document pairs are evaluated to ensure the quality of the comparable corpora. Experimental results indicate that our approach is effective for constructing Chinese-English comparable corpora.
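
The pipeline summarized in the abstract (keyword extraction, keyword translation, retrieval against a target-side index, similarity-based filtering) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the paper's method: the toy dictionary `TOY_DICT`, frequency-based keyword extraction, TF-IDF cosine similarity standing in for the unspecified retrieval tool, and a single fixed `threshold` standing in for the two filtering methods the paper evaluates.

```python
import math
from collections import Counter

# Hypothetical toy bilingual dictionary; a real system would use a large lexicon.
TOY_DICT = {"经济": ["economy"], "增长": ["growth"],
            "足球": ["football"], "比赛": ["match"]}


def extract_keywords(tokens, k=3):
    """Return the k most frequent tokens (a stand-in for a real keyword extractor)."""
    return [word for word, _ in Counter(tokens).most_common(k)]


def translate_keywords(keywords, dictionary):
    """Map each source keyword to its dictionary translations; skip unknown words."""
    query = []
    for word in keywords:
        query.extend(dictionary.get(word, []))
    return query


def tfidf_vectors(docs):
    """Build TF-IDF vectors for tokenized target documents and return the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vecs = [{t: f * idf[t] for t, f in Counter(doc).items()} for doc in docs]
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mine_pairs(source_docs, target_docs, dictionary, threshold=0.3):
    """Pair each source document with its best-scoring target, keeping only
    pairs whose similarity reaches the (assumed) threshold."""
    vecs, idf = tfidf_vectors(target_docs)
    pairs = []
    for i, src in enumerate(source_docs):
        query = translate_keywords(extract_keywords(src), dictionary)
        qvec = {t: f * idf.get(t, 0.0) for t, f in Counter(query).items()}
        scores = [cosine(qvec, v) for v in vecs]
        if not scores:
            continue
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j, scores[j]))
    return pairs


if __name__ == "__main__":
    # Pre-tokenized toy documents; real input would need word segmentation.
    zh_docs = [["经济", "增长", "经济"], ["足球", "比赛"]]
    en_docs = [["football", "match", "tonight"], ["economy", "growth", "slows"]]
    for src_id, tgt_id, score in mine_pairs(zh_docs, en_docs, TOY_DICT):
        print(src_id, tgt_id, round(score, 3))
```

In this sketch the threshold trades corpus size against pair quality: raising it keeps fewer but more closely related document pairs.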