Indexed by:Conference Paper
Date of Publication:2015-05-19
Included Journals:EI, CPCI-S, Scopus
Volume:9077
Page Number:164-175
Key Words:Classification; Clustering; Dimension reduction
Abstract:As a generalization of multi-class classification, multi-label classification allows each sample to be associated with multiple labels. The task becomes challenging when the number of labels grows large, which demands high efficiency. Many approaches have been proposed to address this problem; one of the main ideas is to select a subset of labels that approximately spans the original label space and to perform training only on the selected labels. However, the existing sampling algorithms either require a nondeterministic number of sampling trials or are time-consuming. In this paper, we propose two label selection methods for multi-label classification: (i) clustering-based sampling (CBS), which uses a deterministic number of sampling trials; and (ii) frequency-based sampling (FBS), which uses only label frequency statistics and is therefore more efficient. Moreover, neither algorithm needs to perform singular value decomposition (SVD) on the label matrix, which the previously mentioned approaches require. Experiments on several real-world multi-label data sets, with the number of labels ranging from hundreds to thousands, show that the proposed approaches achieve state-of-the-art performance among label-space-reduction-based multi-label classification algorithms.
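The abstract does not spell out how FBS works, so the following is only an illustrative sketch of the general idea of choosing a label subset from frequency statistics alone, without any SVD of the label matrix. It is not the authors' algorithm. The function names, the deterministic "keep the k most frequent labels" rule, and the tie-breaking by column index are all assumptions made for this example.

```python
def label_frequencies(Y):
    """Count how often each label occurs.

    Y is a binary label matrix given as a list of rows,
    one row per sample and one column per label.
    """
    return [sum(column) for column in zip(*Y)]


def select_labels_by_frequency(Y, k):
    """Return the column indices of the k most frequent labels.

    Assumption for illustration: a plain top-k rule, with ties
    broken in favour of the lower column index.
    """
    freq = label_frequencies(Y)
    order = sorted(range(len(freq)), key=lambda j: (-freq[j], j))
    return sorted(order[:k])


def restrict_to_labels(Y, selected):
    """Keep only the selected label columns, e.g. as training targets."""
    return [[row[j] for j in selected] for row in Y]


# Toy label matrix: 4 samples, 4 labels.
Y = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]
selected = select_labels_by_frequency(Y, k=2)
Y_reduced = restrict_to_labels(Y, selected)
```

In a label space reduction setting, a classifier would then be trained on `Y_reduced` only, and predictions for the remaining labels would have to be recovered from the selected ones; that recovery step is outside the scope of this sketch.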
Professor
Supervisor of Doctorate Candidates
Supervisor of Master's Candidates
Gender:Male
Alma Mater:Dalian University of Technology
Degree:Doctoral Degree
School/Department:Dalian University of Technology
Discipline:Computer Applied Technology
Business Address:816 Yanjiao Building, Dalian University of Technology