
Deep Networks are Reproducing Kernel Chains

Tjeerd Jan Heeringa, Len Spek, Christoph Brune

TL;DR

This work introduces chain Reproducing Kernel Banach Spaces (cRKBS) to model deep neural networks within a principled function-space framework. By composing kernels rather than functions across RKBS layers, the authors preserve RKBS properties and establish a duality-driven theory that tightly links deep networks to neural cRKBS, with rigorous finite-data representer results. The main contributions include a kernel-chaining construction, the specialization to integral and neural cRKBS, and a representer theorem guaranteeing sparse, weight-sharing networks with at most $N$ hidden units per layer and a parameter bound of $N(N+1)(L+1)$. The framework also connects to generalized Barron spaces and existing neural-network spaces, offering insights into geometric learning, generalization, and optimization, while providing a path toward broader applicability beyond standard architectures.

Abstract

Identifying an appropriate function space for deep neural networks remains a key open question. While shallow neural networks are naturally associated with Reproducing Kernel Banach Spaces (RKBS), deep networks present unique challenges. In this work, we extend RKBS to chain RKBS (cRKBS), a new framework that composes kernels rather than functions, preserving the desirable properties of RKBS. We prove that any deep neural network function is a neural cRKBS function, and conversely, any neural cRKBS function defined on a finite dataset corresponds to a deep neural network. This approach provides a sparse solution to the empirical risk minimization problem, requiring no more than $N$ neurons per layer, where $N$ is the number of data points.

Paper Structure

This paper contains 15 sections, 14 theorems, 86 equations, and 3 figures.

Key Result

Theorem 1

Every deep neural network of depth $L$ is an element of the neural cRKBS for depth $L$. Conversely, when only $N$ data points are considered, every function in a neural cRKBS is a deep network with at most $N$ hidden nodes per layer, in which all weights except those of the last layer are shared.
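
To make the size guarantee concrete, the following minimal sketch evaluates the parameter bound $N(N+1)(L+1)$ quoted in the TL;DR. The function names are hypothetical, and the second function rests on an assumption that is not confirmed by the summary: that the bound can be read as $L+1$ affine maps, each taking $N$ inputs to $N$ outputs with one bias per output. The paper's exact accounting may differ.

```python
def crkbs_parameter_bound(n_points: int, depth: int) -> int:
    """Evaluate the quoted bound N(N+1)(L+1) for N data points and depth L."""
    if n_points < 1 or depth < 1:
        raise ValueError("n_points and depth must be positive integers")
    return n_points * (n_points + 1) * (depth + 1)


def dense_width_n_parameter_count(n_points: int, depth: int) -> int:
    """Count parameters of L+1 affine maps of size N -> N with biases.

    Assumption (illustrative only): every layer has N inputs and N outputs,
    so each contributes N*N weights plus N biases.
    """
    per_layer = n_points * n_points + n_points  # weights + biases
    return (depth + 1) * per_layer


if __name__ == "__main__":
    for n, depth in [(10, 3), (100, 5)]:
        print(n, depth, crkbs_parameter_bound(n, depth))
```

Under that assumption the two counts coincide, since $N^2 + N = N(N+1)$ per layer. The bound grows quadratically in the number of data points and linearly in depth.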

Figures (3)

  • Figure 1: Deep networks are not compositions of shallow networks. (left) Function composition leads to an undesired extra bottleneck layer $A^1$, shown in blue; (right) kernel composition matches the layers directly.
  • Figure 2: Schematic representation of the construction of a kernel chain. The dependencies of the spaces have been indicated by arrows: a straight line when there is a map between the chain and link, and a dashed line for a domain identification.
  • Figure 3: Schematic representation of the construction of a neural RKBS chain, with biases omitted from the diagram. The dependencies of the spaces have been indicated by arrows: a straight line when there is a map between the chain and link, and a dashed line for a domain identification. The orange arrows are the extra dependencies compared to the general case depicted in Figure 2.

Theorems & Definitions (33)

  • Theorem 1
  • Definition 2
  • Theorem 3
  • Definition 4
  • Definition 5
  • Theorem 6
  • proof
  • Theorem 7: RKBS Consistency
  • proof
  • Definition 8
  • ...and 23 more