NPSVC++: Nonparallel Classifiers Encounter Representation Learning

Junhong Zhang, Zhihui Lai, Jie Zhou, Guangfei Liang

TL;DR

This work introduces NPSVC++, a multi-objective, Pareto-aware framework that enables end-to-end representation learning for nonparallel classifiers. By coupling class-specific hyperplane objectives with a shared representation through a weighted Chebyshev formulation, it achieves Pareto stationarity and feature optimality across classes, addressing both feature suboptimality and class dependency. The authors present two realizations: K-NPSVC++, a kernel-based method on RKHS with a Stiefel projection, and D-NPSVC++, a deep-learning variant with a skip-connection hypothesis function and a two-step training scheme. Empirical results show that NPSVC++ improves over traditional SVMs and deep softmax baselines on multiple benchmarks, while offering competitive training efficiency, demonstrating the practical impact of Pareto-aware end-to-end learning for nonparallel classifiers.
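To make the weighted Chebyshev idea concrete, here is a minimal sketch of the generic scalarization, which replaces several objectives by the largest weighted one. The function names, the two toy quadratic objectives, the equal weights, and the grid search are all illustrative assumptions; none of them come from the paper, and the paper's actual class-specific objectives and solver are not reproduced here.

```python
import numpy as np


def weighted_chebyshev(objectives, weights):
    """Generic weighted Chebyshev scalarization: max_k weights[k] * objectives[k]."""
    objectives = np.asarray(objectives, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.max(weights * objectives))


def toy_objectives(z):
    """Hypothetical stand-ins for two class objectives sharing one parameter z."""
    return np.array([(z - 1.0) ** 2, (z + 1.0) ** 2])


# Assumed equal weights; minimize the scalarized value by brute-force grid search.
weights = np.array([0.5, 0.5])
grid = np.linspace(-2.0, 2.0, 401)
values = [weighted_chebyshev(toy_objectives(z), weights) for z in grid]
z_star = float(grid[int(np.argmin(values))])
best_value = float(min(values))
```

In this toy case each objective alone would pull `z` toward its own optimum (`1.0` or `-1.0`), while minimizing the worst weighted objective settles on a compromise between the two.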

Abstract

This paper focuses on a specific family of classifiers called nonparallel support vector classifiers (NPSVCs). Unlike typical classifiers, training an NPSVC involves minimizing multiple objectives, which raises the potential concerns of feature suboptimality and class dependency. Consequently, no effective learning scheme has been established to improve NPSVCs' performance through representation learning, especially deep learning. To break this bottleneck, we develop NPSVC++ based on multi-objective optimization, enabling end-to-end learning of an NPSVC and its features. By pursuing Pareto optimality, NPSVC++ theoretically ensures feature optimality across classes, thereby overcoming the two issues above. A general learning procedure via dual optimization is proposed, based on which we provide two applicable instances, K-NPSVC++ and D-NPSVC++. The experiments show their superiority over existing methods and verify the efficacy of NPSVC++.

Paper Structure (24 sections, 2 theorems, 57 equations, 8 figures, 4 tables, 2 algorithms)

Key Result

Theorem 1

The dual problem of the weighted Chebyshev problem (eq:Wcheb-prob) is […], where ${\bm\tau}\in{\mathbb R}_+^K$ denotes the Lagrange multipliers.
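For orientation, the following is a textbook sketch of how a generic weighted Chebyshev problem leads to a dual with nonnegative multipliers. It is not a restatement of Theorem 1. The symbols $J_k$ (the $k$-th objective), $\omega_k > 0$ (its weight), $\theta$ (all trainable parameters), and $\rho$ (the epigraph variable) are assumed notation that does not come from the paper; only the multiplier symbol ${\bm\tau}$ follows the text above.

```latex
\begin{aligned}
&\min_{\theta}\ \max_{k=1,\dots,K}\ \omega_k J_k(\theta)
  \;=\; \min_{\theta,\rho}\ \rho
  \quad \text{s.t.}\quad \omega_k J_k(\theta) \le \rho,\quad k=1,\dots,K, \\
&L(\theta,\rho,{\bm\tau})
  \;=\; \rho + \sum_{k=1}^{K} \tau_k \bigl(\omega_k J_k(\theta) - \rho\bigr),
  \qquad {\bm\tau}\in{\mathbb R}_+^K, \\
&\inf_{\rho} L(\theta,\rho,{\bm\tau}) > -\infty
  \iff \sum_{k=1}^{K} \tau_k = 1, \\
&\max_{{\bm\tau}\in{\mathbb R}_+^K,\ \sum_k \tau_k = 1}\ \min_{\theta}\
  \sum_{k=1}^{K} \tau_k\,\omega_k J_k(\theta).
\end{aligned}
```

In this generic form the multipliers end up on the probability simplex, so the dual searches over convex combinations of the weighted objectives. Whether the paper's dual takes exactly this shape cannot be confirmed from the text shown here.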

Figures (8)

  • Figure 1: Left: NPSVC usually uses a learning-free feature transformation and optimizes the hyperplane of each class independently, resulting in feature suboptimality and disregarding class dependency. Right: NPSVC++ stems from multi-objective optimization, whose goal is to achieve Pareto optimality. Thus, feature optimality across different classes is ensured, tackling the issues of NPSVCs.
  • Figure 2: Training evolution of K- and D-NPSVC++.
  • Figure 3: t-SNE visualization of raw data and learned features.
  • Figure 4: Ablation study of the hypothesis function in D-NPSVC++. Dashed lines indicate the accuracy at convergence.
  • Figure A.1: The architecture of D-NPSVC++ in the experiments. The prior encoder in the experiments is ResNet34. "FC", "BN", and "LN" denote fully connected, batch normalization, and layer normalization layers, respectively.
  • ...and 3 more figures

Theorems & Definitions (5)

  • Definition 1: Pareto optimality (boyd2004CVXOPT)
  • Definition 2: Pareto stationarity (MGDA2012)
  • Theorem 1
  • Theorem 2 (MGDA2012, momma2022multiobj)
  • proof
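
Pareto stationarity is commonly defined as the existence of a convex combination of the objective gradients that equals zero. The sketch below checks this condition for the two-objective case, using the standard closed-form min-norm combination of two gradients. The function names and the tolerance are illustrative assumptions, and the paper's exact statement of Definition 2 is not shown here, so this follows the common definition rather than the authors' wording.

```python
import numpy as np


def min_norm_combination(g1, g2):
    """Return (alpha, d), where d = alpha*g1 + (1-alpha)*g2 has minimal norm over alpha in [0, 1]."""
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:
        # Identical gradients: every alpha yields the same combination.
        alpha = 0.5
    else:
        # Unconstrained minimizer of ||alpha*g1 + (1-alpha)*g2||^2, clipped to [0, 1].
        alpha = float(np.clip((g2 @ (g2 - g1)) / denom, 0.0, 1.0))
    return alpha, alpha * g1 + (1.0 - alpha) * g2


def is_pareto_stationary(g1, g2, tol=1e-8):
    """True if some convex combination of the two gradients is (numerically) zero."""
    _, d = min_norm_combination(g1, g2)
    return bool(np.linalg.norm(d) <= tol)


# Opposing gradients cancel, so no direction improves both objectives at once.
opposing = is_pareto_stationary([2.0, 0.0], [-1.0, 0.0])
# Orthogonal gradients leave a nonzero min-norm vector, i.e. a common descent direction exists.
orthogonal = is_pareto_stationary([1.0, 0.0], [0.0, 1.0])
```

When the check fails, the negative of the returned min-norm vector is a direction that does not increase either objective to first order, which is the idea behind multiple-gradient descent methods.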