Deep Networks are Reproducing Kernel Chains

Tjeerd Jan Heeringa; Len Spek; Christoph Brune

TL;DR

This work introduces chain Reproducing Kernel Banach Spaces (cRKBS) to model deep neural networks within a principled function-space framework. By composing kernels rather than functions across RKBS layers, the authors preserve RKBS properties and establish a duality-driven theory that tightly links deep networks to neural cRKBS, with rigorous finite-data representer results. The main contributions include a kernel-chaining construction, the specialization to integral and neural cRKBS, and a representer theorem guaranteeing sparse, weight-sharing networks with at most $N$ hidden units per layer and a parameter bound of $N(N+1)(L+1)$. The framework also connects to generalized Barron spaces and existing neural-network spaces, offering insights into geometric learning, generalization, and optimization, while providing a path toward broader applicability beyond standard architectures.
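As a quick numerical illustration of the parameter bound quoted above, the following sketch evaluates $N(N+1)(L+1)$ for a given number of data points $N$ and depth $L$. The function name `parameter_bound` and the example values are hypothetical; only the formula itself comes from the summary.

```python
def parameter_bound(num_points: int, depth: int) -> int:
    """Evaluate N(N+1)(L+1) with N = num_points and L = depth.

    Hypothetical helper: it only evaluates the formula quoted in the
    summary and does not derive or verify it.
    """
    if num_points < 1 or depth < 1:
        raise ValueError("num_points and depth must be positive integers")
    return num_points * (num_points + 1) * (depth + 1)


# Example values chosen purely for illustration: N = 10 data points, L = 3.
bound = parameter_bound(10, 3)
print(bound)  # 10 * 11 * 4 = 440
```

The bound grows quadratically in the number of data points and linearly in the depth.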

Abstract

Identifying an appropriate function space for deep neural networks remains a key open question. While shallow neural networks are naturally associated with Reproducing Kernel Banach Spaces (RKBS), deep networks present unique challenges. In this work, we extend RKBS to chain RKBS (cRKBS), a new framework that composes kernels rather than functions, preserving the desirable properties of RKBS. We prove that any deep neural network function is a neural cRKBS function, and conversely, any neural cRKBS function defined on a finite dataset corresponds to a deep neural network. This approach provides a sparse solution to the empirical risk minimization problem, requiring no more than $N$ neurons per layer, where $N$ is the number of data points.

Paper Structure (15 sections, 14 theorems, 86 equations, 3 figures)

Introduction
Related work
Shallow neural networks
Deep neural networks
Our contribution
Chain Reproducing Kernel Banach Spaces
Reproducing Kernel Banach Spaces
Constructing a chain Reproducing Kernel Banach Space
Chain RKBS for Deep Neural Networks
Integral cRKBS
Neural cRKBS
Relation to other spaces
Representer Theorem: kernel chains enable weight sharing
Discussion
Summary and outlook

Key Result

Theorem 1

Every deep neural network of depth $L$ is an element of the neural cRKBS for depth $L$. Conversely, if we only consider $N$ data points, then all functions in a neural cRKBS are deep networks with at most $N$ hidden nodes per layer, and all the weights, except those of the last layer, are shared.
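To make the shape of such a network concrete, here is a minimal sketch of a fully connected network whose hidden layers each have exactly $N$ nodes, evaluated on $N$ data points. It does not implement the representer theorem or the weight sharing; it only illustrates the architecture the statement describes. The ReLU activation, the random weights, the scalar output, and all function names are assumptions, not taken from the paper.

```python
import random


def relu(values):
    """Apply the ReLU activation elementwise (an assumed choice of activation)."""
    return [max(0.0, v) for v in values]


def affine(weights, biases, values):
    """Compute W v + b for a weight matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)]


def make_layer(rng, n_out, n_in):
    """Draw illustrative random weights and biases for one layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def build_network(rng, input_dim, width, num_hidden_layers):
    """Stack hidden layers of equal width, followed by a scalar output layer."""
    sizes = [input_dim] + [width] * num_hidden_layers + [1]
    return [make_layer(rng, n_out, n_in)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def forward(network, x):
    """Evaluate the network: ReLU after every hidden layer, linear output."""
    values = list(x)
    for weights, biases in network[:-1]:
        values = relu(affine(weights, biases, values))
    weights, biases = network[-1]
    return affine(weights, biases, values)[0]


rng = random.Random(0)
# N = 5 hypothetical data points in two dimensions.
data = [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(5)]
# Hidden width is set to N, the largest width the statement allows.
network = build_network(rng, input_dim=2, width=len(data), num_hidden_layers=3)
hidden_widths = [len(biases) for _, biases in network[:-1]]
outputs = [forward(network, x) for x in data]
print(hidden_widths, len(outputs))
```

Here `num_hidden_layers` is a hypothetical parameter; how it relates to the paper's depth $L$ depends on the paper's layer-counting convention, which this summary does not state.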

Figures (3)

Figure 1: Deep networks are not compositions of shallow networks. (left) Function composition leads to an undesired extra bottleneck layer $A^1$, shown in blue; (right) Kernel composition effectively matches layers directly.
Figure 2: Schematic representation of the construction of a kernel chain. The dependencies of the spaces have been indicated by arrows: a straight line when there is a map between the chain and link, and a dashed line for a domain identification.
Figure 3: Schematic representation of the construction of a neural RKBS chain, with biases omitted from the diagram. The dependencies of the spaces have been indicated by arrows: a straight line when there is a map between the chain and link, and a dashed line for a domain identification. The orange arrows are the extra dependencies compared to the general case depicted in Figure 2.

Theorems & Definitions (33)

Theorem 1
Definition 2
Theorem 3
Definition 4
Definition 5
Theorem 6
proof
Theorem 7: RKBS Consistency
proof
Definition 8
...and 23 more
