Multi-Class Imbalanced Learning with Support Vector Machines via Differential Evolution
Zhong-Liang Zhang, Jie Yang, Jian-Ming Ru, Xiao-Xi Zhao, Xing-Gang Luo
TL;DR
This work tackles multi-class imbalanced classification with support vector machines by introducing i-SVM-DE, which builds cost-sensitive and margin-modified constraints into an improved SVM (i-SVM) and solves the resulting large, concatenated parameter optimization problem with differential evolution (DE). Using a one-vs-one (OVO) framework, the $M$-class problem is decomposed into $M(M-1)/2$ binary subproblems, whose parameters are optimized jointly so that the support vectors for every class are learned simultaneously. Two fitness-guided DE variants, i-SVM-DE-AVE and i-SVM-DE-MAX, are evaluated on 15 KEEL datasets and show statistically superior Av$F_{\beta}$ and CBA performance, while SDC leads in G-mean. The approach performs well without a separate validation set, which makes it particularly appealing for small-sample, imbalanced tasks, albeit at a higher computational cost.
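The OVO decomposition and the concatenated parameter vector can be illustrated with a minimal Python sketch. The function names `ovo_pairs` and `parameter_slices`, and the choice of three parameters per binary subproblem, are assumptions made for illustration only; the summary does not state how many parameters each subproblem carries or how the authors lay them out.

```python
from itertools import combinations


def ovo_pairs(num_classes):
    """Return every unordered class pair (i, j) with i < j."""
    return list(combinations(range(num_classes), 2))


def parameter_slices(num_classes, params_per_subproblem):
    """Map each class pair to its slice of one concatenated parameter vector.

    params_per_subproblem is a hypothetical value chosen for illustration.
    """
    layout = {}
    for k, pair in enumerate(ovo_pairs(num_classes)):
        start = k * params_per_subproblem
        layout[pair] = slice(start, start + params_per_subproblem)
    return layout


if __name__ == "__main__":
    M = 4
    pairs = ovo_pairs(M)
    # The number of binary subproblems equals M(M-1)/2.
    print(len(pairs), M * (M - 1) // 2)
    layout = parameter_slices(M, params_per_subproblem=3)
    # Total length of the concatenated vector that a single optimizer searches.
    print(len(pairs) * 3)
    print(layout[(0, 1)], layout[(2, 3)])
```

Under this layout, one candidate solution covers all binary subproblems at once, so a single search can tune them jointly rather than one pair at a time.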
Abstract
The support vector machine (SVM) is a powerful machine learning algorithm for classification tasks. However, the classical SVM was developed for binary problems under the assumption of balanced datasets, and multi-class imbalanced classification problems are considerably more complex. In this paper, we propose an improved SVM via Differential Evolution (i-SVM-DE) method to address such problems. An improved SVM (i-SVM) model handles data imbalance by combining a cost-sensitive technique with a modification of the separation margin in the constraints, which leads to a parameter optimization problem. Using the one-versus-one (OVO) scheme, a multi-class problem is decomposed into a number of binary subproblems, and a single large optimization problem is formed by concatenating the parameters of these binary subproblems. To find the optimal model effectively and learn the support vectors for each class simultaneously, an improved differential evolution (DE) algorithm is applied to solve this large optimization problem. Instead of relying on a validation set, we propose fitness functions that evaluate the learned model and yield the optimal parameters during the DE search. A series of experiments is carried out to verify the benefits of the proposed method. The results indicate that i-SVM-DE is statistically superior to the baseline methods.
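To make the search step concrete, the sketch below runs the classic DE/rand/1/bin scheme over a single flat parameter vector. It is not the authors' improved DE variant, and `toy_fitness` is a hypothetical stand-in (a shifted sphere function, minimized) for the paper's fitness functions, which are not specified in this abstract. The control parameters `F`, `CR`, `pop_size`, and `generations` are likewise illustrative assumptions.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, generations=200,
                           F=0.5, CR=0.9, seed=0):
    """Minimize `fitness` with classic DE/rand/1/bin (not the paper's variant)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    value = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][j]
                trial.append(value)
            trial_score = fitness(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_fitness(params):
    """Hypothetical stand-in fitness: squared distance to the point (1, ..., 1)."""
    return sum((p - 1.0) ** 2 for p in params)


if __name__ == "__main__":
    # Six dimensions, e.g. three OVO subproblems with two parameters each.
    best_params, best_score = differential_evolution(toy_fitness, [(-5.0, 5.0)] * 6)
    print(best_params, best_score)
```

In the setting described above, the fitness would instead score a candidate set of concatenated subproblem parameters on the training data, which is what removes the need for a separate validation set.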
