王宇

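Personal Information

Professor

Master's Supervisor

Gender: Male

Alma Mater: Jilin University

Degree: Doctorate

Affiliation: Institute of Information Management and Information Systems

Discipline: Information Management and E-Government

Office: Room 518, Management Building

Email: ywang@dlut.edu.cn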

Publications

Text Representation and Classification Based on Multi-Instance Learning

Paper Type: Conference Paper

Publication Date: 2009-09-14

Indexed In: EI, CPCI-S, Scopus

Pages: 34-39

Keywords: bag of sentences; multi-instance learning; text classification; text representation

Abstract: In multi-instance learning, the training set comprises labeled bags composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper investigates a text mining problem, text representation, from a multi-instance view. Each text is regarded as a bag, and each of its sentences is regarded as an instance. A bag is labeled with its class label, and the similarity between bags is defined in terms of sentence similarity. The text classification problem is thereby translated into a multi-instance learning problem. To solve it, a Chinese text classifier operating on bags was built with the KNN algorithm, achieving an average precision of 92.12% and a recall of 92.01% in the experiments.
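
The bag-of-sentences idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the sentence similarity measure or how sentence similarities are combined into a bag similarity, so both choices below are assumptions made for illustration only: Jaccard overlap between tokenized sentences, and a symmetric average of best-match sentence similarities between bags. The function names and the toy English data are hypothetical; for Chinese text, the tokens would come from a word segmentation step that is not shown here.

```python
from collections import Counter


def sentence_similarity(a, b):
    """Jaccard overlap between two tokenized sentences (assumed measure)."""
    set_a, set_b = set(a), set(b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def bag_similarity(bag_x, bag_y):
    """Symmetric average of best-match sentence similarities (assumed rule)."""
    if not bag_x or not bag_y:
        return 0.0

    def directed(src, dst):
        # For each sentence in src, take its most similar sentence in dst.
        best = [max(sentence_similarity(s, t) for t in dst) for s in src]
        return sum(best) / len(src)

    return 0.5 * (directed(bag_x, bag_y) + directed(bag_y, bag_x))


def knn_classify(query_bag, training_bags, k=3):
    """Majority vote among the k training bags most similar to query_bag."""
    ranked = sorted(
        training_bags,
        key=lambda item: bag_similarity(query_bag, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set: each text is a bag (list) of tokenized sentences.
training_bags = [
    ([["stock", "market", "rises"], ["investors", "buy", "shares"]], "finance"),
    ([["bank", "cuts", "interest", "rates"], ["market", "reacts"]], "finance"),
    ([["team", "wins", "final", "match"], ["fans", "celebrate"]], "sports"),
    ([["player", "scores", "goal"], ["team", "leads", "league"]], "sports"),
]

query = [["investors", "watch", "market"], ["shares", "fall"]]
predicted = knn_classify(query, training_bags, k=3)
print(predicted)
```

Only the bag label is used for training; individual sentences carry no labels, which is what makes this a multi-instance formulation. Swapping in a different sentence similarity or bag-level aggregation changes only the two helper functions, not the KNN vote.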