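Personal Information
Professor
Master's Supervisor
Gender: Male
Alma Mater: Jilin University
Degree: Doctorate
Affiliation: Institute of Information Management and Information Systems
Discipline: Information Management and E-Government
Office: Management Building, Room 518
Email: ywang@dlut.edu.cn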
Algorithm of the text copy detection based on topic bag
Paper Type: Conference paper
Publication Date: 2010-10-23
Indexed In: EI, Scopus
Volume: 1
Pages: 285-288
Abstract: To address the serious problem of academic plagiarism on the web, this paper proposes a text copy detection algorithm based on topic bags, drawing on the ideas of semantic clustering and multi-instance learning. First, a paper is represented as a three-layer tree: each leaf node denotes a sentence; each branch node represents a topic bag, formed by semantic clustering of several paragraphs; and the root node is the whole text. Second, the similarity between topic bags is calculated from the similarities of their sentences, and the similarity of two papers is then obtained from the similarities and weights of the topic bags. Experiments show that the proposed algorithm achieves higher accuracy. © 2010 IEEE.
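The two-stage similarity computation described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's actual method: the abstract does not give the sentence similarity measure, the bag-level aggregation rule, or the weighting scheme, so the sketch substitutes Jaccard word overlap, best-match averaging, and weights proportional to sentence count. The semantic clustering step is skipped, and topic bags are assumed to be given already. All function names are hypothetical.

```python
from typing import List

Sentence = str
TopicBag = List[Sentence]   # branch node: a group of sentences on one topic
Paper = List[TopicBag]      # root node: the text as a list of topic bags


def sentence_similarity(a: Sentence, b: Sentence) -> float:
    """Jaccard overlap of lowercase word sets (assumed stand-in measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def bag_similarity(bag_a: TopicBag, bag_b: TopicBag) -> float:
    """Mean, over sentences in bag_a, of the best match in bag_b (assumed rule)."""
    if not bag_a or not bag_b:
        return 0.0
    best = [max(sentence_similarity(s, t) for t in bag_b) for s in bag_a]
    return sum(best) / len(best)


def paper_similarity(paper_a: Paper, paper_b: Paper) -> float:
    """Weighted sum of each bag's best match in paper_b.

    Weight = the bag's share of paper_a's sentences (assumed weighting).
    """
    total_sentences = sum(len(bag) for bag in paper_a)
    if total_sentences == 0 or not paper_b:
        return 0.0
    score = 0.0
    for bag in paper_a:
        weight = len(bag) / total_sentences
        score += weight * max(bag_similarity(bag, other) for other in paper_b)
    return score
```

Note that with best-match averaging the score is asymmetric: `paper_similarity(a, b)` measures how much of `a` is covered by `b`, which suits checking a suspect paper against a source.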