Fundamental Limits of Deep Learning-Based Binary Classifiers Trained with Hinge Loss
Tilahun M. Getu, Georges Kaddoum, M. Bennis
TL;DR
The paper tackles a fundamental open question: what are the intrinsic testing performance limits of deep learning-based binary classifiers trained with hinge loss? It develops an asymptotic theory for two network families, deep ReLU FNNs and deep FNNs with ReLU and Tanh activation, characterizing misclassification performance in regimes where the penultimate-layer output norms become very large or vanish. The authors prove that, under the stated limits, the misclassification probability is bounded by the coin-toss level ($P_e \le 1/2$), and that this bound holds regardless of data size, depth, and width. They validate the limits through extensive BPSK-over-AWGN experiments, showing when the theory aligns with practice and when it does not. The work offers a foundational lens for interpreting DL-based binary classifiers, highlights the gap between empirical gains and fundamental limits, and motivates non-asymptotic and multi-class extensions.
Abstract
Although deep learning (DL) has led to several breakthroughs in many disciplines, the fundamental understanding of why and how DL is empirically successful remains elusive. To attack this fundamental problem and unravel the mysteries behind DL's empirical successes, significant innovations toward a unified theory of DL have been made. Although these innovations encompass nearly fundamental advances in optimization, generalization, and approximation, no work has quantified the testing performance of a DL-based algorithm employed to solve a pattern classification problem. To overcome this fundamental challenge in part, this paper exposes the fundamental testing performance limits of DL-based binary classifiers trained with hinge loss. For binary classifiers that are based on deep rectified linear unit (ReLU) feedforward neural networks (FNNs) and deep FNNs with ReLU and Tanh activation, we derive their respective novel asymptotic testing performance limits, which are validated by extensive computer experiments.
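To make the two quantities named above concrete, the following is a minimal sketch of the hinge loss and of an empirical misclassification probability $P_e$ on simulated BPSK symbols over an AWGN channel. It is an illustration under stated assumptions, not the authors' experimental setup: the helper names (`hinge_loss`, `misclassification_rate`, `bpsk_awgn`), the unit symbol energy, the reading of the SNR as $E_b/N_0$, and the two hand-written scorers standing in for a trained network are all hypothetical choices made here.

```python
import math
import random


def hinge_loss(labels, scores):
    """Mean hinge loss max(0, 1 - y * f(x)) for labels y in {-1, +1}."""
    return sum(max(0.0, 1.0 - y * s) for y, s in zip(labels, scores)) / len(labels)


def misclassification_rate(labels, scores):
    """Empirical P_e: fraction of samples whose sign(score) differs from the label."""
    errors = sum(
        1 for y, s in zip(labels, scores) if (1.0 if s >= 0.0 else -1.0) != y
    )
    return errors / len(labels)


def bpsk_awgn(num_samples, snr_db, rng):
    """Draw equiprobable BPSK symbols in {-1, +1} and add real Gaussian noise.

    Assumption: unit symbol energy and snr_db interpreted as Eb/N0,
    so the noise variance is 1 / (2 * Eb/N0).
    """
    noise_std = math.sqrt(1.0 / (2.0 * 10.0 ** (snr_db / 10.0)))
    labels = [rng.choice((-1.0, 1.0)) for _ in range(num_samples)]
    received = [y + rng.gauss(0.0, noise_std) for y in labels]
    return labels, received


rng = random.Random(0)  # fixed seed so the run is repeatable
labels, received = bpsk_awgn(50_000, snr_db=0.0, rng=rng)

# Stand-in scorer 1 (assumption): use the received sample itself as the score.
identity_scores = received
# Stand-in scorer 2 (assumption): a degenerate network whose output is always 0.
zero_scores = [0.0] * len(labels)

pe_identity = misclassification_rate(labels, identity_scores)
pe_zero = misclassification_rate(labels, zero_scores)
loss_zero = hinge_loss(labels, zero_scores)

print(f"P_e (identity scorer): {pe_identity:.4f}")
print(f"P_e (all-zero scorer): {pe_zero:.4f}")
print(f"hinge loss (all-zero scorer): {loss_zero:.4f}")
```

The all-zero scorer is included because it shows the coin-toss level directly: it always predicts the same symbol, so with equiprobable symbols its error rate sits near 1/2, and its hinge loss is exactly 1 because every margin $y f(x)$ is zero.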
