Towards Lower Bounds on the Depth of ReLU Neural Networks
Christoph Hertrich, Amitabh Basu, Marco Di Summa, Martin Skutella
TL;DR
This work studies the trade-off between depth and expressivity in ReLU networks, combining mixed-integer optimization, polyhedral geometry, and tropical geometry to ask which continuous piecewise linear (CPWL) functions become exactly representable as depth grows. It provides a conditional depth lower bound via a mixed-integer programming (MIP) argument. It also shows that for every k ≥ 2, the class ReLU(k) of functions representable with k hidden layers strictly contains MAX(2^k), the class of linear combinations of maxima of at most 2^k affine terms. Finally, it derives polynomial-width upper bounds for representing CPWL functions, using Newton polytopes and convex–concave decompositions. A key contribution is the link between exact representability by a network and Newton polytopes and polyhedral complexes, which gives a geometric view of depth and width that complements the universal approximation theorems. Whether each additional layer strictly enlarges the class of representable functions remains open in general, and the paper points toward geometric and combinatorial proofs of depth lower bounds that go beyond the small-scale, computer-aided MIP evidence.
Abstract
We contribute to a better understanding of the class of functions that can be represented by a neural network with ReLU activations and a given architecture. Using techniques from mixed-integer optimization, polyhedral theory, and tropical geometry, we provide a mathematical counterbalance to the universal approximation theorems which suggest that a single hidden layer is sufficient for learning any function. In particular, we investigate whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). As a by-product of our investigations, we settle an old conjecture about piecewise linear functions by Wang and Sun (2005) in the affirmative. We also present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
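To make "exactly representable" concrete, here is a minimal sketch of the standard construction that computes the maximum of four numbers with two hidden ReLU layers, using the identity max(a, b) = ReLU(a − b) + ReLU(b) − ReLU(−b). The function names `relu`, `max2`, and `max4` are illustrative and do not come from the paper. Nesting the same step k times gives the maximum of 2^k numbers with k hidden layers.

```python
def relu(x):
    """ReLU activation: max(0, x)."""
    return max(0, x)


def max2(a, b):
    """Exact max(a, b) using one hidden layer of three ReLU units.

    ReLU(b) - ReLU(-b) equals b, so the sum is b + ReLU(a - b) = max(a, b).
    """
    return relu(a - b) + relu(b) - relu(-b)


def max4(x1, x2, x3, x4):
    """Exact max of four inputs with two hidden ReLU layers.

    Hidden layer 1 yields the two pairwise maxima; hidden layer 2 takes
    their maximum. The affine combinations between the layers fold into
    the next layer's affine map, so the depth stays at two hidden layers.
    """
    m12 = max2(x1, x2)
    m34 = max2(x3, x4)
    return max2(m12, m34)


if __name__ == "__main__":
    print(max4(3, -1, 7, 2))
```

The sketch only shows that logarithmic depth suffices for a maximum; it says nothing about whether fewer layers would also work, which is the lower-bound question the paper investigates.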
