Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis
Luca Galimberti, Anastasis Kratsios, Giulia Livieri
TL;DR
The paper addresses the problem of learning causal operators acting on infinite-dimensional spaces, a setting common in stochastic analysis, by introducing Causal Neural Operators (CNOs), which couple neural filters with a hypernetwork to preserve temporal causality. It provides two main universal approximation results: a static theorem showing that neural filters can approximate Hölder or smooth trace-class operators between Fréchet spaces on compact sets, and a dynamic theorem showing that causal maps with memory can be uniformly approximated using a finite, well-characterized hypernetwork. In finite dimensions, CNOs reduce to RNNs, and the authors show that approximation guarantees for such causal models can be tighter than those inherited from standard feedforward neural networks (FFNNs) when the target is a causal dynamical system. The work combines approximation theory, functional analysis, and stochastic analysis into a principled framework for operator learning in infinite-dimensional settings, with potential applications to SDE solution operators and related financial models.
Abstract
Several non-linear operators in stochastic analysis, such as solution maps to stochastic differential equations, depend on a temporal structure which is not leveraged by contemporary neural operators designed to approximate general maps between Banach spaces. This paper therefore proposes an operator learning solution to this open problem by introducing a deep learning model-design framework that takes suitable infinite-dimensional linear metric spaces, e.g. Banach spaces, as inputs and returns a universal \textit{sequential} deep learning model adapted to these linear geometries and specialized for the approximation of operators encoding a temporal structure. We call these models \textit{Causal Neural Operators}. Our main result states that the models produced by our framework can uniformly approximate, on compact sets and across arbitrary finite-time horizons, Hölder or smooth trace class operators which causally map sequences between given linear metric spaces. Our analysis uncovers new quantitative relationships for the latent state-space dimension of Causal Neural Operators, which also have new implications for (classical) finite-dimensional Recurrent Neural Networks. In addition, our guarantees for recurrent neural networks are tighter than the available results inherited from feedforward neural networks when approximating dynamical systems between finite-dimensional spaces.
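To make the notion of a causal sequence model concrete, the following is a minimal finite-dimensional sketch in plain Python. It is a toy illustration under stated assumptions, not the construction analysed in the paper: the names `make_causal_model`, `hypernetwork`, and `filter_step`, the scalar inputs, the fixed memory window, and the `tanh` nonlinearity are all hypothetical choices made here. A small per-step "filter" acts on a window of recent inputs, and a fixed "hypernetwork" maps the filter's parameters at one time step to those at the next, so the output at time t can depend only on inputs up to time t.

```python
import math
import random


def make_causal_model(memory=2, seed=0):
    """Build a toy causal sequence model (hypothetical, scalar-valued)."""
    rng = random.Random(seed)
    n_params = memory + 1  # one weight per window entry, plus a bias
    theta0 = [rng.uniform(-1.0, 1.0) for _ in range(n_params)]
    # Fixed weights of the toy "hypernetwork" (assumed form: one tanh layer).
    H = [[rng.uniform(-0.5, 0.5) for _ in range(n_params)]
         for _ in range(n_params)]

    def hypernetwork(theta):
        # Maps the filter parameters at step t to the parameters at step t+1.
        return [math.tanh(sum(H[i][j] * theta[j] for j in range(n_params)))
                for i in range(n_params)]

    def filter_step(theta, window):
        # Per-step filter: an affine map of the input window, then tanh.
        *weights, bias = theta
        return math.tanh(sum(w * x for w, x in zip(weights, window)) + bias)

    def run(xs):
        theta, ys = theta0, []
        for t in range(len(xs)):
            # Window of the last `memory` inputs up to and including time t,
            # zero-padded before the start of the sequence.
            window = [xs[s] if s >= 0 else 0.0
                      for s in range(t - memory + 1, t + 1)]
            ys.append(filter_step(theta, window))
            theta = hypernetwork(theta)  # parameters evolve independently of xs
        return ys

    return run


model = make_causal_model()
xs = [0.1, -0.3, 0.7, 0.2]
xs_changed = xs[:2] + [5.0, -5.0]  # same past, different future
ys, ys_changed = model(xs), model(xs_changed)
```

Because the first two inputs of `xs` and `xs_changed` agree, the first two outputs of `ys` and `ys_changed` agree as well; changing later inputs cannot alter earlier outputs, which is the causality property in this toy setting.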
