个人信息Personal Information
教授
博士生导师
硕士生导师
任职 : 软件工程研究所副所长
性别:男
毕业院校:大连理工大学
学位:博士
所在单位:软件学院、国际信息与软件学院
电子邮箱:zren@dlut.edu.cn
PRST: A PageRank-Based Summarization Technique for Summarizing Bug Reports with Duplicates
点击次数:
论文类型:期刊论文
发表时间:2017-08-01
发表刊物:INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING
收录刊物:SCIE、EI、Scopus
卷号:27
期号:6
页面范围:869-896
ISSN号:0218-1940
关键字:Duplicate bug reports; summarization; supervised learning; PageRank
摘要:During software maintenance, bug reports are widely employed to improve the software project's quality. A developer often refers to stowed bug reports in a repository for bug resolution. However, this reference process often requires a developer to pursue a substantial amount of textual information in bug reports which is lengthy and tedious. Automatic summarization of bug reports is one way to overcome this problem. Both supervised and unsupervised methods are effectively proposed for the automatic summary generation of bug reports. However, existing methods disregard the significance of duplicate bug reports in summarizing bug reports. In this study, we propose a PageRank-based Summarization Technique (PRST), which utilizes the textual information contained in bug reports and additional information in associated duplicate bug reports. PRST uses three variants of PageRank-based on Vector Space Model (VSM), Jaccard, and WordNet similarity metrics. These variants are utilized to calculate the textual similarity of the sentences between the master bug reports and their duplicates. PRST further trains a regression model and predicts the probability of sentences belonging to the summary. Finally, we combine the values of PageRank and regression model scores to rank the sentences and produce the summary for the master bug reports. In addition, we construct two corpora of bug reports and duplicates, i.e. MBRC and OSCAR. Empirical results suggest that PRST outperforms the state-of-the-art method BRC in terms of Precision, Recall, F-score, and Pyramid Precision. Meanwhile, PRST with WordNet achieves the best results against PRST with VSM and Jaccard.