A Margin-based Multiclass Generalization Bound via Geometric Complexity
Michael Munn, Benoit Dherin, Javier Gonzalvo
TL;DR
The paper addresses the question of why deep neural networks generalize well by introducing a margin-based multiclass generalization bound that scales with the geometric complexity (GC) of the network. It proves a bound under data distributions satisfying a Poincaré inequality, linking the generalization error to the margin and to the margin-normalized GC, and extends the analysis from binary to multiclass settings using a covering-number and Dudley integral approach. Empirical validation on ResNet-18 trained on CIFAR-10 and CIFAR-100 with both original and random labels shows a strong correlation between GC and excess risk, with margin normalization stabilizing GC across training. The results offer an architecture-agnostic perspective on generalization, highlight the role of data geometry, and suggest GC as a practical proxy for assessing and possibly guiding generalization in neural networks.
Abstract
There has been considerable effort to better understand the generalization capabilities of deep neural networks, both to unlock a theoretical understanding of their success and to provide directions for further improvements. In this paper, we investigate margin-based multiclass generalization bounds for neural networks that rely on a recent complexity measure developed for neural networks, the geometric complexity. We derive a new upper bound on the generalization error that scales with the margin-normalized geometric complexity of the network and holds for a broad family of data distributions and model classes. We investigate this bound empirically for a ResNet-18 model trained with SGD on the CIFAR-10 and CIFAR-100 datasets with both original and random labels.
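To make the two quantities in the bound concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the geometric complexity is the mean squared Frobenius norm of the input-output Jacobian over the data (the definition is not restated in this abstract) and that the margin is the standard multiclass margin, i.e. the true-class logit minus the largest other logit. The toy linear model, the finite-difference Jacobian, and all function names are hypothetical and chosen only for illustration. The exact margin normalization used in the bound is not given here, so it is not computed.

```python
def linear_model(weights, x):
    """Hypothetical stand-in for a network: logits = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def jacobian_fd(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x, shape (outputs, inputs)."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def geometric_complexity(f, inputs):
    """Assumed definition: mean over the data of ||Jacobian of f at x||_F^2."""
    total = 0.0
    for x in inputs:
        jac = jacobian_fd(f, x)
        total += sum(v * v for row in jac for v in row)
    return total / len(inputs)


def multiclass_margin(logits, label):
    """Standard multiclass margin: true-class logit minus the best other logit."""
    best_other = max(v for k, v in enumerate(logits) if k != label)
    return logits[label] - best_other


# Toy data: 3 classes, 2 input features (all values are illustrative).
weights = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
inputs = [[1.0, 0.0], [0.0, 1.0]]
labels = [0, 1]


def model(x):
    return linear_model(weights, x)


gc = geometric_complexity(model, inputs)  # approximately 10.0 (= ||W||_F^2 for a linear map)
margins = [multiclass_margin(model(x), y) for x, y in zip(inputs, labels)]
print(gc, margins)
```

For a linear map the Jacobian is the weight matrix itself, so the sketch reduces to the squared Frobenius norm of the weights. For a real network such as ResNet-18, one would typically compute the Jacobian with automatic differentiation rather than finite differences.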
