Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks

Peter L. Bartlett; Nick Harvey; Chris Liaw; Abbas Mehrabian

TL;DR

The paper analyzes the VC-dimension and pseudodimension of deep neural networks with piecewise-linear and piecewise-polynomial activations, focusing on how the number of weights W, the number of layers L, and the number of nonlinear units U affect capacity. It uses a refined bit-extraction construction to obtain a nearly tight Ω(WL log(W/L)) lower bound. For piecewise-linear activations it proves upper bounds of O(WL log W) in terms of depth and O(WU) in terms of nonlinear units, and for piecewise-polynomial activations it gives a general O(WU log((d+1)p)) bound. Combined with earlier results, these bounds show how the VC-dimension depends on depth: no dependence for piecewise-constant activations, linear dependence for piecewise-linear activations, and at most quadratic dependence for piecewise-polynomial activations. The results improve on the previously known upper and lower bounds.

Abstract

We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting $W$ be the number of weights and $L$ be the number of layers, we prove that the VC-dimension is $O(W L \log W)$, and provide examples with VC-dimension $\Omega(W L \log(W/L))$. This improves both the previously known upper bounds and lower bounds. In terms of the number $U$ of non-linear units, we prove a tight bound $\Theta(W U)$ on the VC-dimension. All of these bounds generalize to arbitrary piecewise linear activation functions, and also hold for the pseudodimensions of these function classes. Combined with previous results, this gives an intriguing range of dependencies of the VC-dimension on depth for networks with different non-linearities: there is no dependence for piecewise-constant, linear dependence for piecewise-linear, and no more than quadratic dependence for general piecewise-polynomial.
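To get a feel for how close the two depth-dependent bounds are, the following minimal sketch evaluates the growth terms $W L \log(W/L)$ and $W L \log W$ for a given network size. The function name `vc_bound_shapes` and the choice of base-2 logarithm are assumptions made for illustration. The constants hidden by $O(\cdot)$ and $\Omega(\cdot)$ are not stated in the abstract, so the numbers compare growth rates only and are not actual VC-dimension values.

```python
import math


def vc_bound_shapes(W: int, L: int) -> dict:
    """Evaluate the growth terms of the depth-dependent bounds.

    W is the number of weights and L the number of layers.
    Constant factors are deliberately dropped (assumption: they are
    not given in the abstract), so only the ratio is meaningful.
    """
    if L < 1 or W <= L:
        raise ValueError("this sketch assumes L >= 1 and W > L")
    lower_shape = W * L * math.log2(W / L)  # term inside Omega(W L log(W/L))
    upper_shape = W * L * math.log2(W)      # term inside O(W L log W)
    return {
        "lower_shape": lower_shape,
        "upper_shape": upper_shape,
        "ratio": upper_shape / lower_shape,
    }


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to illustrate the gap.
    for W, L in [(1024, 4), (10**6, 10), (10**6, 1000)]:
        shapes = vc_bound_shapes(W, L)
        print(W, L, round(shapes["ratio"], 3))
```

The ratio of the two terms is $\log W / \log(W/L)$, which stays close to 1 when $L$ is much smaller than $W$ and grows as $L$ approaches $W$. This matches the abstract's statement that the bounds are tight for almost the entire range of parameters.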
