Jiang He (江贺)

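Personal Information

Professor

Doctoral Supervisor

Master's Supervisor

Gender: Male

Alma mater: University of Science and Technology of China

Degree: Doctorate

Affiliation: School of Software; International School of Information and Software

Contact: jianghe@dlut.edu.cn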

Publications

PRST: A PageRank-Based Summarization Technique for Summarizing Bug Reports with Duplicates

Paper type: Journal article

Publication date: 2017-08-01

Journal: INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING

Indexed in: SCIE, EI, Scopus

Volume: 27

Issue: 6

Pages: 869-896

ISSN: 0218-1940

Keywords: Duplicate bug reports; summarization; supervised learning; PageRank

Abstract: During software maintenance, bug reports are widely used to improve the quality of a software project. A developer often refers to bug reports stored in a repository when resolving a bug. However, this process often requires the developer to peruse a substantial amount of textual information in bug reports, which is lengthy and tedious. Automatic summarization of bug reports is one way to overcome this problem. Both supervised and unsupervised methods have been proposed for automatically generating summaries of bug reports. However, existing methods disregard the significance of duplicate bug reports when summarizing bug reports. In this study, we propose a PageRank-based Summarization Technique (PRST), which utilizes the textual information contained in bug reports together with additional information in their associated duplicate bug reports. PRST uses three variants of PageRank, based on the Vector Space Model (VSM), Jaccard, and WordNet similarity metrics. These variants are used to calculate the textual similarity between the sentences of the master bug reports and those of their duplicates. PRST further trains a regression model and predicts the probability that each sentence belongs to the summary. Finally, we combine the PageRank values and the regression model scores to rank the sentences and produce the summary for the master bug reports. In addition, we construct two corpora of bug reports and duplicates, i.e., MBRC and OSCAR. Empirical results suggest that PRST outperforms the state-of-the-art method BRC in terms of Precision, Recall, F-score, and Pyramid Precision. Meanwhile, PRST with WordNet achieves the best results of the three variants, ahead of PRST with VSM and Jaccard.
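
The pipeline described in the abstract (sentence similarity, PageRank over the resulting graph, then a combination with a regression score) can be illustrated with a minimal sketch. It is not the authors' implementation. It covers only the Jaccard variant, because that one needs no external libraries. The function names, the damping factor of 0.85, the `alpha` weight, and the weighted-sum combination rule are all assumptions made for illustration, since the abstract does not specify them. The `model_scores` argument is a placeholder for the probabilities a trained regression model would supply.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets: |a & b| / |a | b|."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def pagerank(weights, damping=0.85, iterations=50):
    """Power-iteration PageRank on a weighted graph given as a square matrix.

    weights[j][i] is the weight of the edge from node j to node i.
    Nodes without outgoing weight spread their score uniformly.
    """
    n = len(weights)
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = 0.0
            for j in range(n):
                if out_sum[j] > 0:
                    incoming += scores[j] * weights[j][i] / out_sum[j]
                else:
                    incoming += scores[j] / n  # dangling node
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def rank_master_sentences(master, duplicates, model_scores=None, alpha=0.5):
    """Rank master-report sentences, best first.

    alpha and the weighted sum are hypothetical; the abstract only says
    that PageRank values and regression scores are combined.
    """
    if not master:
        return []
    sentences = list(master) + list(duplicates)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Similarity graph over all sentences, without self-loops.
    weights = [
        [0.0 if i == j else jaccard(tokens[i], tokens[j]) for j in range(n)]
        for i in range(n)
    ]
    ranks = pagerank(weights)[: len(master)]
    top = max(ranks)  # always positive because of the (1 - damping) / n term
    if model_scores is None:
        model_scores = [0.0] * len(master)  # no regression model available
    combined = [
        alpha * r / top + (1 - alpha) * m for r, m in zip(ranks, model_scores)
    ]
    return sorted(zip(master, combined), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    master = [
        "The app crashes when saving a file.",
        "I use version 2.1 on Linux.",
        "Thanks for your help.",
    ]
    duplicates = [
        "Crash happens when saving a file to disk.",
        "Saving a file makes the app crash.",
    ]
    for sentence, score in rank_master_sentences(master, duplicates):
        print(f"{score:.3f}  {sentence}")
```

In this toy input, the master sentence that shares vocabulary with the duplicates receives the highest PageRank value, which is the effect the abstract attributes to using duplicate reports. A summary would then be formed by taking the top-ranked sentences up to a length budget, a step not shown here.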