Indexed by:Journal Article
Date of Publication:2011-03-01
Journal:NEURAL COMPUTING & APPLICATIONS
Included Journals:Scopus, SCIE, EI
Volume:20
Issue:2
Page Number:203-209
ISSN No.:0941-0643
Key Words:Support vector machines; Imbalanced classification; Resample; SMOTE; Ensemble learning
Abstract:Imbalanced data sets often have detrimental effects on the performance of a conventional support vector machine (SVM). To solve this problem, we adopt two strategies together: modifying the data distribution and adjusting the classifier. Both the minority and majority classes are resampled to increase the generalization ability. For the minority class, a one-class support vector machine model combined with the synthetic minority oversampling technique (SMOTE) is used to oversample the support vector instances. For the majority class, we propose a new method that decomposes the class into clusters and removes two clusters, chosen by a distance measure, to lessen the effect of outliers. The remaining clusters are combined with the oversampled minority patterns to build an SVM ensemble, which can achieve better performance by considering potentially suboptimal solutions. Experimental results on benchmark data sets are provided to illustrate the effectiveness of the proposed method.
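The minority-side oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the basic SMOTE interpolation idea. This is not the authors' implementation: the function name `smote_oversample` and its parameters `n_new`, `k`, and `seed` are hypothetical, and the sketch assumes the instances to oversample (for example, the support vectors picked out by a one-class SVM) have already been selected and are passed in as plain coordinate tuples. The one-class SVM selection, the majority-class clustering, and the ensemble are left out.

```python
import math
import random


def smote_oversample(points, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen point and one of its k nearest neighbours (basic SMOTE idea).

    `points` is a list of equal-length numeric tuples. In the setting of the
    abstract these would be the minority instances chosen for oversampling;
    that selection step is assumed to have happened elsewhere.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to interpolate")
    rng = random.Random(seed)
    k = min(k, len(points) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(points))
        base = points[i]
        # Indices of the k nearest neighbours of `base`, excluding `base` itself.
        neighbours = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(base, points[j]),
        )[:k]
        other = points[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        # The new point lies on the segment from `base` towards `other`.
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic
```

A call such as `smote_oversample(minority_sv, n_new=50)`, where `minority_sv` is a hypothetical list of selected minority instances, would return 50 synthetic points that could be appended to the minority class before training.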