A Unified Optimization Framework for Multiclass Classification with Structured Hyperplane Arrangements
Víctor Blanco, Harshit Kothari, James Luedtke
TL;DR
The paper proposes a unified optimization framework for multiclass classification based on structured hyperplane arrangements, formulated as a mixed-integer program (MIP) that generalizes the SVM margin principle to multiple hyperplanes. Its new formulation is computationally more efficient than a previous one, reducing the number of binary variables to $(n + k)|\mathcal{C}|$ with $|\mathcal{C}| = 2^m$, and it extends to decision-tree structures and, via the kernel trick, to nonlinear decision boundaries. A dynamic clustering matheuristic scales the approach to large datasets, and extensive experiments on synthetic and UCI datasets show competitive accuracy with substantial computational gains over prior discrete optimization approaches. The framework yields interpretable, margin-based multiclass classifiers and opens avenues for robustness and decomposition-based scalability improvements.
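To make the count $(n + k)|\mathcal{C}|$ with $|\mathcal{C}| = 2^m$ concrete, the following minimal sketch enumerates the cells of a small hyperplane arrangement and evaluates the formula. It is an illustration under stated assumptions, not the paper's formulation: the summary does not define the symbols, so reading $n$ as the number of observations, $k$ as the number of classes, and $m$ as the number of hyperplanes is assumed here, and the helper names `cell_signature` and `binary_variable_count` are hypothetical.

```python
from itertools import product


def cell_signature(x, hyperplanes):
    """Sign vector of point x with respect to each hyperplane (w, b).

    A point on a hyperplane is assigned +1; this tie-break is an
    arbitrary choice made for the illustration.
    """
    return tuple(
        1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
        for w, b in hyperplanes
    )


def binary_variable_count(n, k, m):
    """Evaluate (n + k) * |C| with |C| = 2**m.

    Assumed meaning: n observations, k classes, m hyperplanes.
    """
    return (n + k) * 2 ** m


# Two hyperplanes in the plane (the coordinate axes): x1 = 0 and x2 = 0.
hyperplanes = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]

# One sample point per quadrant.
points = [(1.0, 2.0), (-1.0, 2.0), (-1.0, -2.0), (1.0, -2.0)]

# Each point falls in the cell identified by its sign vector.
signatures = [cell_signature(p, hyperplanes) for p in points]

# All candidate cells: every sign pattern over m hyperplanes, 2**m in total.
all_cells = list(product((-1, 1), repeat=len(hyperplanes)))

print(signatures)
print(len(all_cells))
print(binary_variable_count(n=len(points), k=3, m=len(hyperplanes)))
```

The number of cells doubles with every added hyperplane, so under this reading the count of binary variables grows exponentially in $m$ but only linearly in $n$ and $k$.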
Abstract
In this paper, we propose a new mathematical optimization model for multiclass classification based on arrangements of hyperplanes. Our approach preserves the core support vector machine (SVM) paradigm of maximizing class separation while minimizing misclassification errors, and it is computationally more efficient than a previous formulation. We present a kernel-based extension that allows the model to construct nonlinear decision boundaries. Furthermore, we show how the framework can naturally incorporate alternative geometric structures, including classification trees, $\ell_p$-SVMs, and models with discrete feature selection. To address large-scale instances, we develop a dynamic clustering matheuristic that leverages the proposed mixed-integer programming (MIP) formulation. Extensive computational experiments demonstrate the efficiency of the proposed model and of the dynamic clustering heuristic, and we report competitive classification performance on both synthetic datasets and real-world benchmarks from the UCI Machine Learning Repository, comparing our method with state-of-the-art implementations available in scikit-learn.
