Embeddings between Barron spaces with higher order activation functions

Tjeerd Jan Heeringa, Len Spek, Felix Schwenninger, Christoph Brune

TL;DR

This work investigates how the choice of activation function influences Barron spaces that represent infinitely wide shallow networks. It introduces a push-forward measure framework to construct embeddings between Barron spaces associated with different activations, with a focus on rectified power units (RePU) and Lipschitz activations. It establishes a hierarchical structure $\mathcal{B}_{\mathrm{RePU}_t}\hookrightarrow\mathcal{B}_{\mathrm{RePU}_s}$ for $0\le s\le t$, and extends embeddings to broader activation families via linear combinations, convolutions, and derivatives, including a bridge from spectral Barron spaces to RePU-based Barron spaces using Fourier analysis and Taylor remainder arguments. These results provide a principled way to transfer approximation properties across activations and illuminate how activation design shapes the associated function spaces. The framework offers insights into when and how activation changes preserve or enhance representational capacity in the infinite-width regime, with potential implications for activation selection and transferability of Barron-space-based guarantees.

Abstract

The approximation properties of infinitely wide shallow neural networks depend heavily on the choice of the activation function. To understand this influence, we study embeddings between Barron spaces with different activation functions. These embeddings are proven by providing push-forward maps on the measures $\mu$ used to represent functions $f$. An activation function of particular interest is the rectified power unit ($\operatorname{RePU}$) given by $\operatorname{RePU}_s(x)=\max(0,x)^s$. For many commonly used activation functions, the well-known Taylor remainder theorem can be used to construct a push-forward map, which allows us to prove the embedding of the associated Barron space into a Barron space with a $\operatorname{RePU}$ as activation function. Moreover, the Barron spaces associated with the $\operatorname{RePU}_s$ have a hierarchical structure similar to the Sobolev spaces $H^m$.
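
The link between the Taylor remainder theorem and $\operatorname{RePU}_s$ can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's construction: it uses the standard integral form of the remainder at expansion point $0$, in which the kernel $(x-t)^s$ on $[0,x]$ equals $\operatorname{RePU}_s(x-t)$ on a larger interval $[0,\text{upper}]$. The function names, the choice $f=\sin$, the integer order $s\ge 1$, and the midpoint quadrature are all assumptions made for the example.

```python
import math


def repu(x, s):
    """Rectified power unit max(0, x)**s (used here for integer s >= 1)."""
    return max(0.0, x) ** s


def taylor_with_repu_remainder(derivs, s, x, upper, n=20000):
    """Evaluate f(x) for 0 <= x <= upper from Taylor data at 0.

    derivs[k] is a callable for the k-th derivative of f, k = 0..s+1.
    The remainder (1/s!) * int_0^upper RePU_s(x - t) * f^(s+1)(t) dt
    is approximated with the composite midpoint rule on n cells.
    """
    # Taylor polynomial of degree s around 0.
    poly = sum(derivs[k](0.0) * x**k / math.factorial(k) for k in range(s + 1))

    # Integral remainder with the RePU kernel; the kernel vanishes for t > x,
    # so integrating up to `upper` instead of x does not change the value.
    h = upper / n
    integral = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        integral += repu(x - t, s) * derivs[s + 1](t)
    return poly + integral * h / math.factorial(s)


# Assumed example: f = sin with derivatives up to order s + 1 = 3.
sin_derivs = [
    math.sin,
    math.cos,
    lambda t: -math.sin(t),
    lambda t: -math.cos(t),
]
approx = taylor_with_repu_remainder(sin_derivs, s=2, x=1.2, upper=2.0)
print(approx, math.sin(1.2))
```

Read this way, the remainder term is an integral over shifted $\operatorname{RePU}_s$ neurons weighted by $f^{(s+1)}(t)/s!$, which is the kind of measure-based representation the abstract refers to.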

Paper Structure

This paper contains 14 sections, 14 theorems, 125 equations, and 1 figure.

Key Result

Theorem 1

Let $\psi$ and $\phi$ be Lipschitz activation functions. If for all $x\in\mathbb{R}$ and for some measure $\gamma\in \mathcal{M}(\mathbb{R}^2)$ satisfying then $\mathcal{B}_\phi\hookrightarrow\mathcal{B}_{\psi}$. Moreover,

Figures (1)

  • Figure 1: Each circle represents a neuron, and arrows represent connections between neurons. On the left, a network with $\phi$ as activation function representing $f$ is shown. The activation function $\phi$ can be represented using a shallow neural network with 3 neurons in the hidden layer and activation function $\psi$. On the right, a network with $\psi$ as activation function representing $f$ is shown. The network representing $\phi$ is used to construct the network on the right from that on the top left. Colors have been added to track which neuron on the right corresponds to which on the top left.
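
The substitution described in the caption can be sketched for a finite network. The example below is a minimal illustration, not the paper's general measure-theoretic statement: it assumes $\psi$ is the ReLU, $\phi$ is the hat function $\max(0, 1-|x|)$, and $\phi$ is written exactly with 3 ReLU neurons. The names `GAMMA`, `network`, and `push_forward` are hypothetical.

```python
def relu(x):
    """psi in this example: the rectified linear unit."""
    return max(0.0, x)


def hat(x):
    """phi in this example: the hat function max(0, 1 - |x|)."""
    return max(0.0, 1.0 - abs(x))


# Assumed discrete measure representing phi with psi-neurons.
# Each entry is (outer weight c, inner weight v, bias d), so that
# hat(x) = relu(x + 1) - 2 * relu(x) + relu(x - 1).
GAMMA = [(1.0, 1.0, 1.0), (-2.0, 1.0, 0.0), (1.0, 1.0, -1.0)]


def network(neurons, act, x):
    """Shallow network sum_i a_i * act(w_i * x + b_i) for scalar x."""
    return sum(a * act(w * x + b) for a, w, b in neurons)


def push_forward(neurons, gamma):
    """Replace every phi-neuron (a, w, b) by len(gamma) psi-neurons.

    a * phi(w*x + b) = sum_j (a*c_j) * psi((v_j*w)*x + (v_j*b + d_j)).
    """
    return [
        (a * c, v * w, v * b + d)
        for a, w, b in neurons
        for c, v, d in gamma
    ]


# A network with 2 phi-neurons becomes a network with 6 psi-neurons.
phi_net = [(0.7, 2.0, -0.3), (-1.1, -1.0, 0.5)]
psi_net = push_forward(phi_net, GAMMA)
for x in (-2.0, -0.4, 0.0, 0.3, 1.5):
    print(x, network(phi_net, hat, x), network(psi_net, relu, x))
```

Each $\phi$-neuron on the left turns into three $\psi$-neurons on the right, so the two networks compute the same function while the hidden layer grows by a factor equal to the number of atoms in the assumed measure.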

Theorems & Definitions (25)

  • Theorem 1
  • Remark
  • Proposition 2.1
  • proof
  • Lemma 2.1
  • Lemma 2.2
  • Proposition 2.2
  • proof
  • Proposition 2.3
  • proof
  • ...and 15 more