顾宏
Paper type: Journal article
Publication date: 2011-03-01
Published in: NEURAL COMPUTING & APPLICATIONS
Indexed in: Scopus, SCIE, EI
Volume: 20
Issue: 2
Pages: 203-209
ISSN: 0941-0643
Keywords: Support vector machines; Imbalanced classification; Resample; SMOTE; Ensemble learning
Abstract: Imbalanced data sets often have detrimental effects on the performance of a conventional support vector machine (SVM). To solve this problem, we adopt both strategies: modifying the data distribution and adjusting the classifier. Both the minority and majority classes are resampled to increase generalization ability. For the minority class, a one-class support vector machine model combined with the synthetic minority oversampling technique (SMOTE) is used to oversample the support vector instances. For the majority class, we propose a new method that decomposes the majority class into clusters and removes two clusters using a distance measure to lessen the effect of outliers. The remaining clusters are used to build an SVM ensemble with the oversampled minority patterns. The SVM ensemble can achieve better performance by considering potentially suboptimal solutions. Experimental results on benchmark data sets are provided to illustrate the effectiveness of the proposed method.
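The SMOTE interpolation step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract applies SMOTE to support vector instances selected by a one-class SVM, whereas the sketch below assumes those instances have already been selected and are passed in as plain coordinate tuples. The function name `smote`, its parameters `n_new`, `k`, and `seed`, and the sample data are all hypothetical.

```python
import math
import random


def smote(points, n_new, k=3, seed=0):
    """Basic SMOTE: create n_new synthetic points, each lying on the line
    segment between a randomly chosen point and one of its k nearest
    neighbours. `points` is assumed to hold the minority instances that
    were already selected for oversampling (hypothetical input)."""
    if len(points) < 2:
        raise ValueError("need at least two points to interpolate")
    rng = random.Random(seed)
    k = min(k, len(points) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(points))
        base = points[i]
        # Rank all other points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(points) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority instances: the corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=6)
```

Because every synthetic point is a convex combination of two existing points, the new samples stay inside the region already spanned by the minority class; the cluster removal and SVM ensemble stages described in the abstract are not covered by this sketch.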