A Generalization Bound for a Family of Implicit Networks
Samy Wu Fung, Benjamin Berkels
TL;DR
This work derives a generalization bound for a broad family of implicit neural networks defined by contractive fixed-point operators. By bounding the Rademacher complexity via a covering-number argument and Dudley’s inequality, the authors obtain a bound that scales with the parameter count $p$ and is largely depth-agnostic. The bound applies across architectures such as single-layer contractive networks, Monotone Equilibrium Networks, and gradient-descent–based schemes, provided standard Lipschitz and boundedness assumptions hold. Experiments on CT and MNIST-like data illustrate the bound’s $\mathcal{O}(1/\sqrt{N})$ behavior and demonstrate practical estimation of the constants involved, though the bound is not guaranteed to be tight. Overall, the paper advances theoretical understanding of generalization in implicit networks and suggests avenues for integrating such models as differentiable layers within larger systems.
Abstract
Implicit networks are a class of neural networks whose outputs are defined by the fixed point of a parameterized operator. They have enjoyed success in many applications, including natural language processing and image processing. Despite this abundant empirical success, their generalization remains theoretically under-explored. In this work, we consider a large family of implicit networks defined by parameterized contractive fixed-point operators. We show a generalization bound for this class based on a covering number argument for the Rademacher complexity of these architectures.
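To make the phrase "outputs defined by the fixed point of a parameterized operator" concrete, here is a minimal sketch of one contractive implicit layer. It is an illustration, not the authors' implementation. It assumes a hypothetical operator $T(u; x) = \tanh(Wu + Ux + b)$, and the names `rescale_rows`, `apply_T`, and `fixed_point` are invented for this example. Because $\tanh$ is 1-Lipschitz, rescaling $W$ so that its induced infinity norm is at most $\gamma < 1$ makes $T$ a contraction in $u$ with respect to the sup norm, so plain fixed-point iteration converges to a unique fixed point.

```python
import math
import random


def rescale_rows(W, gamma=0.9):
    """Scale W so its induced infinity norm (max absolute row sum) is at most gamma."""
    norm = max(sum(abs(w) for w in row) for row in W)
    scale = 1.0 if norm <= gamma else gamma / norm
    return [[scale * w for w in row] for row in W]


def apply_T(W, U, b, u, x):
    """Apply the assumed operator T(u; x) = tanh(W u + U x + b) once."""
    return [
        math.tanh(
            sum(w * uj for w, uj in zip(W_row, u))
            + sum(c * xj for c, xj in zip(U_row, x))
            + bi
        )
        for W_row, U_row, bi in zip(W, U, b)
    ]


def fixed_point(W, U, b, x, tol=1e-10, max_iter=1000):
    """Iterate u <- T(u; x) from u = 0 until successive iterates differ by at most tol."""
    u = [0.0] * len(W)
    for k in range(1, max_iter + 1):
        u_next = apply_T(W, U, b, u, x)
        if max(abs(a - c) for a, c in zip(u_next, u)) <= tol:
            return u_next, k
        u = u_next
    return u, max_iter


# Hypothetical sizes and random parameters, chosen only for illustration.
random.seed(0)
n, d = 6, 3
W = rescale_rows([[random.gauss(0, 1) for _ in range(n)] for _ in range(n)])
U = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
b = [random.gauss(0, 1) for _ in range(n)]
x = [random.gauss(0, 1) for _ in range(d)]

u_star, iters = fixed_point(W, U, b, x)

# Residual max_i |u*_i - T(u*; x)_i|; it is small when u* is (nearly) a fixed point.
residual = max(abs(a - c) for a, c in zip(u_star, apply_T(W, U, b, u_star, x)))
```

The network output is the fixed point `u_star` itself rather than the result of a fixed number of layers, which is consistent with a bound that depends on the parameter count $p$ instead of depth.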
