朴勇

Personal Information

Associate Professor

Master's Supervisor

Gender: Male

Alma mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Software; International School of Information and Software

Office: School of Software, Dalian University of Technology, Dalian Economic Development Zone

Contact: 15641190702

Email: piaoy@dlut.edu.cn

Papers

A hybrid method for XML clustering by structure and content

Paper type: Journal article

Publication date: 2011-01-01

Published in: Journal of Software

Indexed by: Scopus

Volume: 6

Issue: 12 (Special Issue)

Pages: 2361-2368

ISSN: 1796-217X

Abstract: This paper presents an effective XML clustering method, the neighbor center clustering (NCC) algorithm, which measures similarity using both the structural and the content information contained in XML files. Structural similarity is first measured with a frequency-path model, and a similarity calculation algorithm based on the longest common subsequence with position and frequency weights is introduced. To improve performance and precision, the frequency-path model is then extended to consider structure and content information simultaneously. Experiments show that NCC combined with the hybrid similarity calculation method achieves high purity and F-measure values and is effective and applicable for clustering XML documents with both homogeneous and heterogeneous structures. © 2011 Academy Publisher.
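
The structural part of the abstract (paths with frequencies, compared by longest common subsequence) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not give the weighting formulas, so the function names (`tag_paths`, `path_similarity`, `structural_similarity`), the normalization by the longer path, and the frequency-weighted best-match average are all assumptions made for illustration. Position weights, content similarity, and the NCC clustering step itself are left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_paths(xml_text):
    """Count how often each root-to-leaf tag path occurs in a document."""
    counts = Counter()

    def walk(node, prefix):
        path = prefix + (node.tag,)
        children = list(node)
        if not children:
            counts[path] += 1  # leaf element: record the complete path
        for child in children:
            walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return counts


def lcs_length(a, b):
    """Length of the longest common subsequence of two tag sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def path_similarity(p, q):
    """Assumed normalization: LCS length divided by the longer path length."""
    return lcs_length(p, q) / max(len(p), len(q))


def structural_similarity(paths_a, paths_b):
    """Assumed aggregate: frequency-weighted best match, averaged both ways."""
    def one_way(src, dst):
        total = sum(src.values())
        best = (freq * max(path_similarity(p, q) for q in dst)
                for p, freq in src.items())
        return sum(best) / total

    return (one_way(paths_a, paths_b) + one_way(paths_b, paths_a)) / 2
```

Under these assumptions, two documents with identical tag structure score 1.0 regardless of their text content, and documents that share no tags score 0.0. A hybrid measure in the spirit of the abstract would additionally compare the text stored at each path.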