Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis

Luca Galimberti; Anastasis Kratsios; Giulia Livieri

TL;DR

The paper addresses learning causal operators acting on infinite-dimensional spaces, a regime common in stochastic analysis, by introducing Causal Neural Operators that couple neural filters with a hypernetwork to preserve temporal causality. It provides two main universal approximation results: a static theorem showing neural filters can approximate Hölder or smooth trace-class operators between Fréchet spaces on compact sets, and a dynamic theorem showing causal maps with memory can be uniformly approximated with a finite, well-characterized hypernetwork. In finite dimensions, CNOs reduce to RNNs and the authors show that causal learning can be more parameter-efficient than standard FFNNs, offering super-optimal rates for causal dynamics. The work unifies approximation theory, functional analysis, and stochastic analysis to deliver a principled, scalable framework for operator learning in infinite-dimensional settings with potential applications to SDE solution operators and related financial models.

Abstract

Several non-linear operators in stochastic analysis, such as solution maps to stochastic differential equations, depend on a temporal structure which is not leveraged by contemporary neural operators designed to approximate general maps between Banach spaces. This paper therefore proposes an operator learning solution to this open problem by introducing a deep learning model-design framework that takes suitable infinite-dimensional linear metric spaces, e.g. Banach spaces, as inputs and returns a universal sequential deep learning model adapted to these linear geometries and specialized for the approximation of operators encoding a temporal structure. We call these models Causal Neural Operators. Our main result states that the models produced by our framework can uniformly approximate, on compact sets and across arbitrary finite-time horizons, Hölder or smooth trace class operators which causally map sequences between given linear metric spaces. Our analysis uncovers new quantitative relationships on the latent state-space dimension of Causal Neural Operators, which even have new implications for (classical) finite-dimensional Recurrent Neural Networks. In addition, our guarantees for recurrent neural networks are tighter than the available results inherited from feedforward neural networks when approximating dynamical systems between finite-dimensional spaces.

Paper Structure (12 sections, 1 theorem, 13 equations, 4 figures)

This paper contains 12 sections, 1 theorem, 13 equations, 4 figures.

Introduction
Our contribution.
Our contribution in the Approximation Theory of Neural Operators.
Our contribution in the Approximation Theory of RNNs
Technical contributions:
Organization of our paper
Notation
Preliminaries
Fréchet spaces
Feedforward Neural Networks with ReLU and PReLU activation functions
Main Results
Static Case: Universal Approximation

Key Result

Lemma

Let $E$ and $B$ be two Fréchet spaces. Let $f : E \rightarrow B$ be a (non-linear) operator between these two spaces which is $C^{k}$-Dir. (see the subsection on Fréchet spaces, below the equation defining the directional derivative of order $k$). Then $f$ is $C^{k}$-stable in the sense of the definition of $C^k$-Stability.

Figures (4)

Figure 1: The Causal Neural Operator Model. Summary: A universal approximator of regular causal sequences of operators between well-behaved Fréchet spaces. Overview: The model successively applies a "universal" neural filter (see Figure \ref{fig:model_staticcase}) on consecutive time-windows; the internal parameters of this neural filter evolve according to a latent dynamical system on the neural filter's parameter space, implemented by a deep ReLU network called a hypernetwork.
Figure 2: The Neural Filter. Summary: A universal approximator of regular maps between any well-behaved Fréchet spaces. Overview: The neural filter first encodes inputs from a (possibly infinite-dimensional) linear space by approximately representing the input as coefficients of a sparse (Schauder) basis. These basis coefficients are then transformed by a deep ReLU network, and the network's outputs are decoded as the coefficients of a sparse basis representation of an element of the output linear space. Assembling the basis using the outputted coefficients produces the neural filter's output.
Figure 3: Illustration of our "static" operator network in Definition \ref{def:Neural_Filter}. The network works in three phases. 1) First, inputs are encoded as finite-dimensional Euclidean data by mapping them to their truncated (Schauder) basis coefficients in the input space $E$. 2) Next, these coefficients are transformed by a ReLU FFNN. 3) Finally, the ReLU FFNN's outputs are interpreted as coefficients for a truncated (Schauder) basis in the output space $F$.
Figure 4: Pictorial representation of the fact that the indicator function of the interval $[0,1]$ belongs to $C^{k, \lambda}_{\operatorname{tr}}([0,1], \mathbb{R})$ for all $k \in \mathbb{N}$ and $\lambda > 0$; see Example \ref{example:indicator}.
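
The captions of Figures 1 to 3 describe an encode, transform, decode pipeline whose parameters are advanced in time by a hypernetwork. The following is a minimal NumPy sketch of that idea, not the authors' implementation. Everything specific in it is an assumption made for illustration: the input and output spaces are both taken to be $L^2([0,1])$ with a truncated cosine basis standing in for the Schauder basis, all weights are random and untrained, and the function names (`neural_filter`, `causal_neural_operator`, and the helpers) are hypothetical.

```python
import numpy as np

def cosine_basis(n_terms, grid):
    """First n_terms orthonormal cosine basis functions of L^2([0,1]) on a grid."""
    k = np.arange(n_terms)[:, None]
    basis = np.sqrt(2.0) * np.cos(np.pi * k * grid[None, :])
    basis[0] = 1.0  # the constant function already has unit norm
    return basis  # shape (n_terms, len(grid))

def encode(u, basis):
    """Phase 1: truncated basis coefficients, inner products via a midpoint rule."""
    return (basis * u[None, :]).mean(axis=1)

def decode(coeffs, basis):
    """Phase 3: assemble a function from its truncated basis coefficients."""
    return coeffs @ basis

def init_params(sizes, rng):
    """Random (untrained) weights and zero biases for a ReLU network."""
    return [(rng.normal(scale=1.0 / np.sqrt(m), size=(n, m)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def relu_mlp(z, params):
    """Phase 2: feedforward network with ReLU on all hidden layers."""
    for W, b in params[:-1]:
        z = np.maximum(W @ z + b, 0.0)
    W, b = params[-1]
    return W @ z + b

def flatten(params):
    """Stack all weights and biases into one parameter vector."""
    return np.concatenate([np.concatenate([W.ravel(), b]) for W, b in params])

def unflatten(theta, sizes):
    """Inverse of flatten for a network with the given layer sizes."""
    params, i = [], 0
    for m, n in zip(sizes[:-1], sizes[1:]):
        W = theta[i:i + n * m].reshape(n, m)
        i += n * m
        b = theta[i:i + n]
        i += n
        params.append((W, b))
    return params

def neural_filter(u, params, basis_in, basis_out):
    """Static model: encode, transform with a ReLU network, decode."""
    return decode(relu_mlp(encode(u, basis_in), params), basis_out)

def causal_neural_operator(windows, theta0, hyper_params, sizes, basis_in, basis_out):
    """Apply the filter to consecutive time windows; a hypernetwork
    advances the filter's parameter vector between windows."""
    theta, outputs = theta0, []
    for u in windows:
        outputs.append(neural_filter(u, unflatten(theta, sizes), basis_in, basis_out))
        theta = relu_mlp(theta, hyper_params)  # latent dynamics on parameter space
    return np.stack(outputs)

rng = np.random.default_rng(0)
m = 64
grid = (np.arange(m) + 0.5) / m            # midpoint grid on [0, 1]
sizes = [4, 8, 3]                          # 4 input coefficients, 3 output coefficients
basis_in, basis_out = cosine_basis(4, grid), cosine_basis(3, grid)
theta0 = flatten(init_params(sizes, rng))
hyper_params = init_params([theta0.size, 16, theta0.size], rng)
windows = np.stack([np.sin(2 * np.pi * (grid + 0.1 * t)) for t in range(5)])
outputs = causal_neural_operator(windows, theta0, hyper_params, sizes, basis_in, basis_out)
print(outputs.shape)  # (5, 64)
```

In this sketch the output at each time step depends only on the current window and on a parameter vector determined by the initial parameters and the step index, so altering a later window cannot change an earlier output. The midpoint grid is chosen because the cosine family is exactly orthonormal under the discrete mean on it, which makes `encode` an exact left inverse of `decode` on the truncated basis.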

Theorems & Definitions (8)

Definition: Fréchet space
Definition: Directional Derivative
Remark
Remark
Definition: $C^k$-Stability
Lemma
Proof
Definition: Trace Class $C^{k,\lambda}_{\operatorname{tr}}(K,B)$
