Text Representation and Classification Based on Multi-Instance Learning

Indexed by:Conference paper

Date of Publication:2009-09-14

Included Journals:EI, CPCI-S, Scopus

Page Number:34-39

Key Words:bag of sentences; multi-instance learning; text classification; text representation

Abstract:In multi-instance learning, the training set consists of labeled bags, each composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper investigates a text mining problem, text representation, from a multi-instance view. Specifically, each text is regarded as a bag and each of its sentences as an instance. A bag is labeled with its class label, and the similarity between bags is defined in terms of sentence similarity. Text classification is thereby translated into a multi-instance learning problem. To solve this problem, a bag-level Chinese text classifier was built with the KNN algorithm, achieving an average precision of 92.12% and a recall of 92.01% in the experiments.
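
The bag-of-sentences idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which sentence similarity measure or bag-level aggregation the paper uses, so everything below is an assumption made for illustration: sentences are taken to be pre-tokenized lists of words, `sentence_similarity` uses Jaccard overlap, and `bag_similarity` averages each sentence's best match in the other bag. The function names and toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def sentence_similarity(a, b):
    """Jaccard overlap between two tokenized sentences (assumed measure)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def bag_similarity(bag_x, bag_y):
    """Symmetric mean of best-match sentence similarities (assumed aggregation)."""
    if not bag_x or not bag_y:
        return 0.0
    x_to_y = sum(max(sentence_similarity(s, t) for t in bag_y) for s in bag_x) / len(bag_x)
    y_to_x = sum(max(sentence_similarity(t, s) for s in bag_x) for t in bag_y) / len(bag_y)
    return (x_to_y + y_to_x) / 2


def knn_classify(query_bag, training_bags, k=3):
    """Return the majority label among the k training bags most similar to query_bag.

    training_bags is a list of (bag, label) pairs; each bag is a list of
    tokenized sentences.
    """
    ranked = sorted(
        training_bags,
        key=lambda item: bag_similarity(query_bag, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: each text is a bag, each sentence an instance (hypothetical example).
training = [
    ([["stock", "market", "rises"], ["investors", "buy", "shares"]], "finance"),
    ([["bank", "raises", "interest", "rates"], ["market", "reacts"]], "finance"),
    ([["team", "wins", "match"], ["fans", "cheer"]], "sports"),
    ([["player", "scores", "goal"], ["team", "celebrates"]], "sports"),
]
query = [["investors", "watch", "market"], ["shares", "fall"]]

predicted = knn_classify(query, training, k=3)
print(predicted)
```

Only the bags carry labels here; individual sentences are never labeled, which matches the multi-instance setting described in the abstract. For Chinese text, a word segmentation step would be needed before this sketch applies, and the choice of similarity measure would likely affect the precision and recall obtained.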
