Complexity of One-Dimensional ReLU DNNs

Jonathan Kogan, Hayden Jananthan, Jeremy Kepner

TL;DR

This work analyzes the expressivity of one-dimensional ReLU DNNs by counting linear regions in the infinite-width regime with He initialization and nonzero biases. It proves that the expected number of linear regions scales as the sum of layer widths plus lower-order terms, establishing a precise asymptotic relationship via breakpoints, and introduces a function-adaptive notion of sparsity based on region usage relative to the minimal necessary complexity. The authors develop a Gaussian-process framework with covariance recursions to count breakpoints and show that breakpoints generated in each layer propagate to the output with high probability as widths grow. The region-adaptive sparsity concept ties network sparsity to expressivity, offering an architecture-agnostic measure grounded in approximation complexity rather than parameter counts.

Abstract

We study the expressivity of one-dimensional (1D) ReLU deep neural networks through the lens of their linear regions. For randomly initialized, fully connected 1D ReLU networks (He scaling with nonzero bias) in the infinite-width limit, we prove that the expected number of linear regions grows as $\sum_{\ell = 1}^{L} n_\ell + o\left(\sum_{\ell = 1}^{L} n_\ell\right) + 1$, where $L$ is the number of hidden layers and $n_\ell$ denotes the number of neurons in the $\ell$-th hidden layer. We also propose a function-adaptive notion of sparsity that compares the expected number of regions used by the network to the minimal number needed to approximate a target within a fixed tolerance.
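The region count described in the abstract can be explored numerically. The following is a minimal sketch, not the authors' method: it draws a fully connected 1D ReLU network with He-scaled weights and Gaussian biases, then counts how often the hidden-layer activation pattern changes along a finite grid. The function names, the bias standard deviation `bias_std`, the interval `[lo, hi]`, and the grid size are all assumptions made for illustration. The count is only a proxy: it misses breakpoints outside the interval, merges several changes that fall within one grid cell, and counts activation-pattern changes rather than slope changes of the network output.

```python
import math
import random


def init_network(widths, bias_std=1.0, seed=0):
    """Draw hidden layers for a scalar-input ReLU network.

    Weights use He scaling, N(0, 2 / fan_in). The bias distribution
    N(0, bias_std^2) is an illustrative assumption, not taken from the paper.
    """
    rng = random.Random(seed)
    layers = []
    fan_in = 1  # one-dimensional input
    for n in widths:
        w_std = math.sqrt(2.0 / fan_in)
        weights = [[rng.gauss(0.0, w_std) for _ in range(fan_in)] for _ in range(n)]
        biases = [rng.gauss(0.0, bias_std) for _ in range(n)]
        layers.append((weights, biases))
        fan_in = n
    return layers


def activation_pattern(layers, x):
    """Return the on/off state of every hidden neuron at input x."""
    h = [x]
    pattern = []
    for weights, biases in layers:
        pre = [sum(w * v for w, v in zip(row, h)) + b
               for row, b in zip(weights, biases)]
        pattern.extend(p > 0.0 for p in pre)
        h = [max(p, 0.0) for p in pre]
    return tuple(pattern)


def count_regions_on_grid(layers, lo=-5.0, hi=5.0, num_points=2001):
    """Count 1 + (number of grid steps where the activation pattern changes)."""
    step = (hi - lo) / (num_points - 1)
    regions = 1
    prev = activation_pattern(layers, lo)
    for k in range(1, num_points):
        cur = activation_pattern(layers, lo + k * step)
        if cur != prev:
            regions += 1
        prev = cur
    return regions


if __name__ == "__main__":
    widths = [16, 16, 16]
    counts = [count_regions_on_grid(init_network(widths, seed=s)) for s in range(10)]
    print("sum of widths + 1:", sum(widths) + 1)
    print("mean grid estimate:", sum(counts) / len(counts))
```

Averaging the estimate over several seeds and comparing it with `sum(widths) + 1` gives a rough feel for the scaling. Because the widths here are small and the interval is finite, agreement with the infinite-width statement should not be expected.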

Paper Structure

This paper contains 13 sections, 9 theorems, 42 equations.

Key Result

Theorem 4.2

Theorems & Definitions (22)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 3.1: minimal linear complexity
  • Definition 3.2: $(f, \varepsilon_0, \alpha, c)$-region-adaptive sparsity
  • Definition 4.1
  • Theorem 4.2
  • Proposition 5.1
  • proof
  • Corollary 5.2
  • ...and 12 more