An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification
Hyenkyun Woo
TL;DR
The paper tackles class-imbalanced and scale-class-imbalanced classification by introducing SIGTRON, an extended asymmetric sigmoid with Perceptron, and the SIC model, which uses a virtual SIGTRON-induced convex loss with internal parameters $(\alpha_+,\alpha_-)$. It develops a quasi-Newton (L-BFGS) optimization framework with an interval-based bisection line search to minimize the virtual convex losses efficiently. Across 118 datasets (51 two-class and 67 multi-class), SIC often delivers superior or competitive test accuracy compared with $\pi$-weighted convex focal losses and LIBLINEAR, with binary tasks showing notable gains and multi-class tasks remaining competitive with kernel methods. The work provides insight into skewed hyperplanes and dataset imbalance structure, offering an internally parameterized alternative to external cost-sensitive weighting for imbalanced learning.
Abstract
This article presents a new polynomial parameterized sigmoid called SIGTRON, an extended asymmetric sigmoid with Perceptron, and its companion convex model, the SIGTRON-imbalanced classification (SIC) model, which employs a virtual SIGTRON-induced convex loss function. In contrast to the conventional $\pi$-weighted cost-sensitive learning model, the SIC model has no external $\pi$-weight on the loss function; instead, it has internal parameters in the virtual SIGTRON-induced loss function. As a consequence, when the given training dataset is close to the well-balanced condition with respect to the (scale-)class-imbalance ratio, we show that the proposed SIC model is more adaptive to variations of the dataset, such as inconsistency of the (scale-)class-imbalance ratio between the training and test datasets. This adaptation is justified by a skewed hyperplane equation, created via linearization of the gradient satisfying the $\epsilon$-optimal condition. Additionally, we present a quasi-Newton optimization (L-BFGS) framework for the virtual convex loss by developing an interval-based bisection line search. Empirically, we have observed that the proposed approach outperforms (or is comparable to) the $\pi$-weighted convex focal loss and the balanced classifier LIBLINEAR (logistic regression, SVM, and L2SVM) in terms of test classification accuracy on $51$ two-class and $67$ multi-class datasets. In binary classification problems, where the scale-class-imbalance ratio of the training dataset is not significant but the inconsistency exists, a group of SIC models with the best test accuracy for each dataset (TOP$1$) outperforms LIBSVM (C-SVC with RBF kernel), a well-known kernel-based classifier.
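
To make the contrast with external $\pi$-weighting and the role of a bisection line search more concrete, the following is a minimal illustrative sketch, not the method of the paper. The abstract does not define the SIGTRON-induced loss or the details of the interval-based bisection line search, so the sketch assumes a standard $\pi$-weighted logistic loss (weight $\pi$ on positive samples, $1-\pi$ on negative samples, which is only one common convention) and a textbook bisection on the directional derivative of a convex function. All function names (`weighted_logistic_loss`, `bisection_line_search`, `make_toy_data`, `descent_step_demo`) are hypothetical.

```python
import numpy as np


def weighted_logistic_loss(w, X, y, pi):
    """pi-weighted logistic loss (assumed convention: pi on y=+1, 1-pi on y=-1)."""
    margins = y * (X @ w)
    weights = np.where(y > 0, pi, 1.0 - pi)
    # log(1 + exp(-m)), evaluated in a numerically stable way
    return float(np.sum(weights * np.logaddexp(0.0, -margins)))


def weighted_logistic_grad(w, X, y, pi):
    """Gradient of the pi-weighted logistic loss with respect to w."""
    margins = y * (X @ w)
    weights = np.where(y > 0, pi, 1.0 - pi)
    # 1 / (1 + exp(m)) written via tanh to avoid overflow
    sig_neg = 0.5 * (1.0 - np.tanh(0.5 * margins))
    return X.T @ (-weights * y * sig_neg)


def bisection_line_search(phi_prime, t_hi=1.0, tol=1e-8, max_iter=100):
    """Textbook bisection on phi'(t) for a convex phi with phi'(0) < 0."""
    t_lo = 0.0
    # Grow the interval until the derivative changes sign (or a cap is hit).
    for _ in range(60):
        if phi_prime(t_hi) >= 0.0:
            break
        t_lo, t_hi = t_hi, 2.0 * t_hi
    # Halve the bracketing interval until it is shorter than tol.
    for _ in range(max_iter):
        t_mid = 0.5 * (t_lo + t_hi)
        if phi_prime(t_mid) > 0.0:
            t_hi = t_mid
        else:
            t_lo = t_mid
        if t_hi - t_lo < tol:
            break
    return 0.5 * (t_lo + t_hi)


def make_toy_data(seed=0, n_neg=90, n_pos=10):
    """Synthetic imbalanced two-class data with a bias column (illustration only)."""
    rng = np.random.default_rng(seed)
    neg = rng.normal(loc=-1.0, scale=1.0, size=(n_neg, 2))
    pos = rng.normal(loc=1.0, scale=1.0, size=(n_pos, 2))
    X = np.hstack([np.vstack([neg, pos]), np.ones((n_neg + n_pos, 1))])
    y = np.concatenate([-np.ones(n_neg), np.ones(n_pos)])
    return X, y


def descent_step_demo(X, y, pi):
    """One steepest-descent step from w = 0 using the bisection line search."""
    w = np.zeros(X.shape[1])
    d = -weighted_logistic_grad(w, X, y, pi)  # descent direction

    def phi_prime(t):
        # Directional derivative of the loss along d at step size t
        return float(weighted_logistic_grad(w + t * d, X, y, pi) @ d)

    step = bisection_line_search(phi_prime)
    before = weighted_logistic_loss(w, X, y, pi)
    after = weighted_logistic_loss(w + step * d, X, y, pi)
    return before, after, step
```

Calling `descent_step_demo(*make_toy_data(), pi=0.9)` returns the loss before and after one step together with the chosen step size. In a quasi-Newton setting such as L-BFGS, the direction `d` would come from the approximate inverse Hessian rather than the negative gradient; steepest descent is used here only to keep the sketch short.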
