Wide stable neural networks: Sample regularity, functional convergence and Bayesian inverse problems

Tomás Soto

TL;DR

This work analyzes wide neural networks with $\alpha$-stable (heavy-tailed) weights and shows that, under appropriate scaling, the infinite-width limit exists as an $\alpha$-stable process $f^{\infty}$ whose sample paths lie in fractional Sobolev spaces $W^{s,p}(\mathcal{U})$. It proves functional convergence of the finite-width laws to the limit law on $W^{s,p}(\mathcal{U})$ and derives convergence of Bayesian posteriors for edge-preserving inverse problems using stable priors. The results extend to deep architectures under Lipschitz activations, via a recursive regularity argument, yielding discretization-invariance-type guarantees for both priors and posteriors. The framework thus provides a principled, functional-space treatment of non-Gaussian wide networks and their use in Bayesian inverse problems, with connections to Besov-type spaces and potential extensions to more general function spaces.

Abstract

We study the large-width asymptotics of random fully connected neural networks with weights drawn from $\alpha$-stable distributions, a family of heavy-tailed distributions arising as the limiting distributions in the Gnedenko-Kolmogorov heavy-tailed central limit theorem. We show that in an arbitrary bounded Euclidean domain $\mathcal{U}$ with smooth boundary, the random field at the infinite-width limit, characterized in previous literature in terms of finite-dimensional distributions, has sample functions in the fractional Sobolev-Slobodeckij-type quasi-Banach function space $W^{s,p}(\mathcal{U})$ for integrability indices $p < \alpha$ and suitable smoothness indices $s$ depending on the activation function of the neural network, and establish the functional convergence of the processes in the space of probability measures on $W^{s,p}(\mathcal{U})$. This convergence result is leveraged in the study of functional posteriors for edge-preserving Bayesian inverse problems with stable neural network priors.

Paper Structure (10 sections, 13 theorems, 92 equations, 2 figures)

Key Result

Theorem 1.1

Let $\mathcal{U}$ be a bounded domain in $\mathbb{R}^d$ with $C^{\infty}$-smooth boundary (an open interval when $d = 1$). Assume that the activation function $\varphi$ is Hölder-continuous and uniformly sublinearly increasing, i.e. that there exist $\lambda \in (0,1]$ and $\beta \in [0,1)$ such that the corresponding Hölder-continuity and sublinear-growth bounds hold for some constants $c_\varphi$, $c'_\varphi < \infty$. Assume that $\alpha \in (d/(d+\lambda),2)$.
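
To give a feel for the admissible range of $\alpha$ in the last assumption, the lower bound $d/(d+\lambda)$ can be evaluated for a few sample values. The choices of $d$ and $\lambda$ below are illustrative assumptions, not cases singled out by the paper:

$$
\begin{aligned}
d = 1,\ \lambda = 1 &: \quad \frac{d}{d+\lambda} = \frac{1}{2}, & \alpha &\in \left(\tfrac{1}{2},\, 2\right), \\
d = 2,\ \lambda = 1 &: \quad \frac{d}{d+\lambda} = \frac{2}{3}, & \alpha &\in \left(\tfrac{2}{3},\, 2\right), \\
d = 2,\ \lambda = \tfrac{1}{2} &: \quad \frac{d}{d+\lambda} = \frac{4}{5}, & \alpha &\in \left(\tfrac{4}{5},\, 2\right).
\end{aligned}
$$

The lower bound increases with the dimension $d$ and as $\lambda$ decreases, so the assumption admits fewer heavy-tailed indices $\alpha$ in higher dimensions and for smaller $\lambda$.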

Figures (2)

  • Figure 1: Sample paths of wide neural networks with one hidden layer on $[-1,1]$
  • Figure 2: Sample functions of wide neural networks with one hidden layer on $[-1,1]^2$
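
Sample paths of the kind listed above can be approximated numerically. The following is a minimal sketch, not the author's code. It assumes a one-hidden-layer network $f(x) = n^{-1/\alpha} \sum_{j=1}^{n} v_j \varphi(w_j x + b_j)$ with all weights and biases i.i.d. standard symmetric $\alpha$-stable, the $n^{-1/\alpha}$ normalisation suggested by the stable central limit theorem, and $\varphi = \tanh$. The function names `sample_symmetric_stable` and `sample_network` are hypothetical, and the sampler uses the Chambers-Mallows-Stuck representation for the symmetric case.

```python
import math
import random


def sample_symmetric_stable(alpha, rng):
    """Draw one standard symmetric alpha-stable variate, 0 < alpha <= 2.

    Chambers-Mallows-Stuck representation with zero skewness.
    """
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-rate exponential
    if alpha == 1.0:
        return math.tan(u)                      # Cauchy case
    return (
        math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
        * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)
    )


def sample_network(width, alpha, rng, activation=math.tanh):
    """Return one random network x -> f(x) with `width` hidden units.

    Assumption: input weights, biases and output weights are all i.i.d.
    symmetric alpha-stable, and the output is scaled by width**(-1/alpha).
    """
    def draw():
        return sample_symmetric_stable(alpha, rng)

    params = [(draw(), draw(), draw()) for _ in range(width)]
    scale = width ** (-1.0 / alpha)

    def f(x):
        return scale * sum(v * activation(w * x + b) for w, b, v in params)

    return f


# Evaluate one sample path on a grid over [-1, 1].
rng = random.Random(0)
f = sample_network(width=1000, alpha=1.5, rng=rng)
grid = [-1.0 + 2.0 * k / 200 for k in range(201)]
path = [f(x) for x in grid]
```

Because a few hidden units with very large output weights dominate the sum, paths drawn this way tend to show a small number of sharp transitions rather than Gaussian-like roughness, which fits the edge-preserving use of these priors. Setting `alpha=2.0` reduces the sampler to a Gaussian with variance 2.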

Theorems & Definitions (32)

  • Theorem 1.1
  • Remark 1.2
  • Remark 1.3
  • Remark 1.4
  • Proposition 2.1
  • Remark 2.2
  • proof
  • Proposition 2.3
  • Remark 2.4
  • proof
  • ...and 22 more