Dalian University of Technology Faculty Homepage Platform: Xu Kan (许侃), Learning to rank using smoothing methods for language modeling

Xu Kan (许侃)

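Senior Engineer

Gender: Male

Alma mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Computer Science and Technology

Discipline: Computer Application Technology

Office: Room D0103, Innovation Park Building

Contact: QQ: 2407849530

Email: xukan@dlut.edu.cn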

Learning to rank using smoothing methods for language modeling

Paper type: Journal article

Publication date: 2013-04-01

Journal: JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY

Indexed in: SCIE, EI, SSCI, Scopus

Volume: 64

Issue: 4

Pages: 818-828

ISSN: 1532-2882

Keywords: machine learning; information retrieval; searching

Abstract: The central issue in language model estimation is smoothing, a technique for avoiding the zero-probability estimation problem and overcoming data sparsity. There are three representative smoothing methods: the Jelinek-Mercer (JM) method, Bayesian smoothing using Dirichlet priors (Dir), and absolute discounting (Dis), whose parameters are usually estimated empirically. Previous research in information retrieval (IR) on smoothing parameter estimation tends to select a single value from a set of candidate values for the whole collection, but that value may not be appropriate for every query. The effectiveness of all the candidate values should be considered to improve ranking performance. Recently, learning to rank has become an effective approach to optimizing ranking accuracy by merging existing retrieval methods. In this article, the smoothing methods for language modeling in information retrieval (LMIR) with different parameters are treated as different retrieval methods, and a learning to rank approach is presented that learns a ranking model from the features extracted by these smoothing methods. During learning, the effectiveness of all the candidate smoothing parameters is taken into account for all queries. Experimental results on the Learning to Rank for Information Retrieval (LETOR) data sets LETOR3.0 and LETOR4.0 show that the approach is effective in improving the performance of LMIR.
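The idea in the abstract, treating each smoothing method at each parameter value as a separate retrieval feature, can be illustrated with a minimal Python sketch. It uses the textbook forms of the three smoothing estimates and is not the authors' implementation. The function names, the `grid` argument, and the choice to skip query terms that never occur in the collection are assumptions made for illustration; the abstract does not name a specific learning to rank algorithm, so the sketch stops at feature extraction.

```python
import math
from collections import Counter


def jelinek_mercer(tf, doc_len, n_unique, p_coll, lam):
    # Linear interpolation between the document and collection models.
    return (1.0 - lam) * tf / doc_len + lam * p_coll


def dirichlet(tf, doc_len, n_unique, p_coll, mu):
    # Bayesian smoothing with a Dirichlet prior of pseudo-count mass mu.
    return (tf + mu * p_coll) / (doc_len + mu)


def absolute_discount(tf, doc_len, n_unique, p_coll, delta):
    # Subtract delta from each seen count and give the freed mass
    # (delta * n_unique / doc_len) to the collection model.
    return max(tf - delta, 0.0) / doc_len + delta * n_unique / doc_len * p_coll


def smoothing_features(query, doc, coll_tf, coll_len, grid):
    """Return one query log-likelihood score per (method, parameter) pair.

    query, doc: lists of tokens; doc must be non-empty.
    coll_tf: dict mapping term to its collection frequency.
    grid: list of (method_function, [parameter values]) pairs (hypothetical).
    """
    counts = Counter(doc)
    doc_len = len(doc)
    n_unique = len(counts)
    features = []
    for method, params in grid:
        for param in params:
            score = 0.0
            for term in query:
                p_coll = coll_tf.get(term, 0) / coll_len
                if p_coll == 0.0:
                    # Assumption: ignore terms unseen in the whole collection,
                    # since no smoothing method can assign them probability.
                    continue
                score += math.log(method(counts[term], doc_len, n_unique, p_coll, param))
            features.append(score)
    return features
```

Each query-document pair then yields a feature vector with one entry per candidate parameter value, for example from a grid such as `[(jelinek_mercer, [0.1, 0.5, 0.9]), (dirichlet, [500, 1000, 2000]), (absolute_discount, [0.3, 0.7])]`. Those vectors could be passed to any learning to rank model, which weighs the candidate parameter values instead of committing to a single one per collection.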
