Integral Signatures of Activation Functions: A 9-Dimensional Taxonomy and Stability Theory for Deep Learning

Ankur Mali, Lawrence Hall, Jake Williams, Gordon Richards

TL;DR

The paper introduces a nine‑dimensional integral signature $\mathcal{S}_\sigma(\phi)=(m_1,g_1,g_2,m_2,\eta,\alpha_+,\alpha_-,\mathrm{TV}(\phi'),C(\phi))$ to unify Gaussian propagation, asymptotic growth, and regularity of activation functions. It proves well‑posedness, affine reparameterization laws, closure under bounded slope variation, and links these coordinates to Lyapunov stability and kernel conditioning, yielding principled design guidance. By applying the framework to ReLU, leaky‑ReLU, tanh, sigmoid, Swish, GELU, Mish, and TeLU, the paper provides sharp distinctions between saturating, linear‑growth, and smooth families and validates predictions with Gauss‑Hermite and Monte Carlo experiments. The work offers a practical, provable framework for activation design, aiming to move beyond heuristic choices toward stability‑ and kernel‑aware selections in deep networks.

Abstract

Activation functions govern the expressivity and stability of neural networks, yet existing comparisons remain largely heuristic. We propose a rigorous framework for their classification via a nine-dimensional integral signature $\mathcal{S}_\sigma(\phi)$, combining Gaussian propagation statistics $(m_1, g_1, g_2, m_2, \eta)$, asymptotic slopes $(\alpha_+, \alpha_-)$, and regularity measures $(\mathrm{TV}(\phi'), C(\phi))$. This taxonomy establishes well-posedness, affine reparameterization laws with bias, and closure under bounded slope variation. Dynamical analysis yields Lyapunov theorems with explicit descent constants and identifies variance stability regions through $(m_2', g_2)$. From a kernel perspective, we derive dimension-free Hessian bounds and connect smoothness to bounded variation of $\phi'$. Applying the framework, we classify eight standard activations (ReLU, leaky-ReLU, tanh, sigmoid, Swish, GELU, Mish, TeLU), proving sharp distinctions between saturating, linear-growth, and smooth families. Numerical Gauss-Hermite and Monte Carlo validation confirms theoretical predictions. Our framework provides principled design guidance, moving activation choice from trial-and-error to provable stability and kernel conditioning.
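
As a rough illustration of the kind of Gauss-Hermite versus Monte Carlo cross-check the abstract mentions, the sketch below estimates the second moment $m_2(\sigma)=\mathbb{E}[\phi(\sigma G)^2]$ for tanh and ReLU. It is not the authors' code: the function names, the node count, the sample size, and the choice $\sigma=1$ are all assumptions made for this example.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gauss_hermite_expectation(f, sigma, n_nodes=80):
    """Approximate E[f(sigma * G)], G ~ N(0, 1), by Gauss-Hermite quadrature.

    hermegauss uses the probabilists' weight exp(-x^2 / 2), whose weights
    sum to sqrt(2 * pi), so we divide by that constant to normalize.
    """
    nodes, weights = hermegauss(n_nodes)
    return float(np.sum(weights * f(sigma * nodes)) / np.sqrt(2.0 * np.pi))


def monte_carlo_expectation(f, sigma, n_samples=200_000, seed=0):
    """Approximate E[f(sigma * G)] by plain Monte Carlo sampling."""
    rng = np.random.default_rng(seed)
    return float(np.mean(f(sigma * rng.standard_normal(n_samples))))


def second_moment(phi, sigma, expectation=gauss_hermite_expectation):
    """m_2(sigma) = E[phi(sigma * G)^2] under the chosen estimator."""
    return expectation(lambda z: phi(z) ** 2, sigma)


def relu(z):
    return np.maximum(z, 0.0)


sigma = 1.0  # assumed pre-activation scale for this example

# Saturating activation: compare the two estimators.
m2_tanh_gh = second_moment(np.tanh, sigma)
m2_tanh_mc = second_moment(np.tanh, sigma, monte_carlo_expectation)

# Linear-growth activation: the exact value is sigma^2 / 2 by symmetry.
m2_relu_gh = second_moment(relu, sigma)

print(f"tanh  m2: quadrature={m2_tanh_gh:.6f}, Monte Carlo={m2_tanh_mc:.6f}")
print(f"ReLU  m2: quadrature={m2_relu_gh:.6f}, exact={0.5 * sigma**2:.6f}")
```

The quadrature estimate is deterministic, while the Monte Carlo estimate carries sampling error on the order of $1/\sqrt{n}$, so agreement is only expected to a few decimal places at this sample size.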

Paper Structure

This paper contains 63 sections, 32 theorems, 122 equations, and 4 tables.

Key Result

Lemma 2.1

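Let $Z=\sigma G$ with $G\sim\mathcal{N}(0,1)$. If $\phi$ has at most polynomial growth and $\phi'$ is locally integrable, then the Gaussian integration-by-parts identity holds for $\phi(Z)$. Moreover, $m_2(\sigma)=\mathbb{E}[\phi(\sigma G)^2]$ is differentiable in $\sigma$, with an explicit formula for its derivative.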
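
For orientation, the classical Gaussian integration-by-parts (Stein) identity and the chain-rule derivative of $m_2$ can be sketched as follows. The notation is assumed, and the formulas hold when $\phi$ is regular enough to differentiate under the expectation; the lemma's own displays may be stated differently.

$$
\mathbb{E}[Z\,\phi(Z)] = \sigma^2\,\mathbb{E}[\phi'(Z)],
\qquad
\frac{d}{d\sigma}\,\mathbb{E}\big[\phi(\sigma G)^2\big] = 2\,\mathbb{E}\big[G\,\phi(\sigma G)\,\phi'(\sigma G)\big].
$$

The first identity follows from $\mathbb{E}[G f(G)]=\mathbb{E}[f'(G)]$ applied to $f(g)=\phi(\sigma g)$, whose derivative is $\sigma\,\phi'(\sigma g)$. The second is the chain rule applied inside the expectation.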

Theorems & Definitions (81)

  • Definition 2.1: Lebesgue $L^p$ spaces
  • Definition 2.2: Gaussian measure and $L^p(\gamma_\sigma)$
  • Remark 2.1: Lebesgue vs. Gaussian integrability
  • Definition 2.3: Gaussian moment signatures
  • Definition 2.4: Derivative and mixed signatures
  • Lemma 2.1: Gaussian integration by parts and differentiation
  • proof
  • Remark 2.2: On the role of moment sequences
  • Definition 2.5: Polynomial growth exponent
  • Definition 2.6: Asymptotic linear slopes
  • ...and 71 more