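Personal Information
Professor
Master's Supervisor
Gender: Male
Alma mater: Jilin University
Degree: Ph.D.
Affiliation: Institute of Information Management and Information Systems
Discipline: Information Management and E-Government
Office: Room 518, Management Building
Email: ywang@dlut.edu.cn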
Text Representation and Classification Based on Multi-Instance Learning
Paper type: Conference paper
Publication date: 2009-09-14
Indexed in: EI, CPCI-S, Scopus
Pages: 34-39
Keywords: bag of sentences; multi-instance learning; text classification; text representation
Abstract: In multi-instance learning, the training set comprises labeled bags composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper investigates a text mining problem, text representation, from a multi-instance view. Each text is regarded as a bag, and each of its sentences is regarded as an instance. Each bag is labeled with its class label, and the similarity between bags is defined in terms of sentence similarity. The text classification problem is thereby translated into a multi-instance learning problem. To solve it, a bag-level Chinese text classifier was built with the KNN algorithm, achieving an average precision of 92.12% and an average recall of 92.01% in the experiments.
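The bag-of-sentences idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which sentence similarity, bag-level aggregation, or value of k the paper uses, so everything below is an assumption made for illustration: sentences are taken to be already tokenized (Chinese text would first need word segmentation), sentence similarity is Jaccard overlap, bag similarity is the symmetric average of best-matching sentence pairs, and the names `sentence_similarity`, `bag_similarity`, and `knn_predict` are hypothetical. The toy data is invented and unrelated to the paper's experiments.

```python
from collections import Counter


def sentence_similarity(a, b):
    """Jaccard overlap between two tokenized sentences (assumed measure)."""
    sa, sb = set(a), set(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def bag_similarity(bag_x, bag_y):
    """Symmetric average of best-match sentence similarities (assumed aggregation)."""
    if not bag_x or not bag_y:
        return 0.0

    def directed(src, dst):
        # For each sentence in src, take its most similar sentence in dst.
        return sum(max(sentence_similarity(s, t) for t in dst) for s in src) / len(src)

    return 0.5 * (directed(bag_x, bag_y) + directed(bag_y, bag_x))


def knn_predict(train, query_bag, k=3):
    """Majority vote among the k training bags most similar to query_bag.

    train is a list of (bag, label) pairs; a bag is a list of token lists.
    """
    ranked = sorted(train, key=lambda item: bag_similarity(item[0], query_bag), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy example: each document is a bag, each sentence an instance.
train = [
    ([["stock", "market", "rises"], ["investors", "buy", "shares"]], "finance"),
    ([["bank", "raises", "interest", "rates"], ["market", "reacts"]], "finance"),
    ([["team", "wins", "match"], ["player", "scores", "goal"]], "sports"),
    ([["coach", "praises", "team"], ["fans", "cheer", "goal"]], "sports"),
]
query = [["investors", "watch", "market"], ["shares", "fall"]]
print(knn_predict(train, query, k=3))
```

Only bags carry labels here; individual sentences are never labeled, which is what makes this a multi-instance formulation rather than sentence-level classification.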