Low-dimensional approximations of the conditional law of Volterra processes: a non-positive curvature approach

Reza Arabpour; John Armstrong; Luca Galimberti; Anastasis Kratsios; Giulia Livieri

TL;DR

The paper addresses the estimation of the conditional evolution of non-Markovian Volterra processes with stochastic volatility. It projects the conditional law onto a low-dimensional, non-positively curved (NPC) manifold of non-singular Gaussians, $\mathcal{N}_d$, endowed with a novel NPC perturbation of the Fisher geometry. A sequential geometric deep learning framework, the Hypergeometric Network (HGN), then approximates the projected dynamics on $\mathcal{N}_d$. It uses a gating-style hypernetwork to synchronize a sequence of expert GDNs, each tied to a specific time, thereby circumventing backpropagation through time. The authors establish universal approximation results with quantitative rates in both the static and dynamic settings, show how the memory of the projection decays in terms of the Volterra kernel, and provide extensive ablations that validate the model and highlight the role of curvature, memory, and kernel decay. The framework yields a tractable, scalable approach to learning measure-valued dynamics in high dimensions, with practical implications for forecasting conditional laws in financial modeling and related stochastic systems. Overall, the work combines differential geometry, measure-valued statistics, and geometric deep learning to enable low-dimensional yet expressive approximations of infinite-dimensional conditional laws.
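To make the phrase "non-Markovian Volterra process" concrete, the following is a minimal sketch, assuming a scalar process $X_t = x_0 + \sigma\int_0^t K(t-s)\,dW_s$ with a power-law kernel $K(u) = u^{H-1/2}$ and a left-point Riemann-sum discretization. The kernel, the parameter names (`hurst`, `sigma`), and both function names are illustrative assumptions, not the class of processes or the scheme used in the paper.

```python
import numpy as np

def power_kernel(u, hurst=0.3):
    """Hypothetical power-law Volterra kernel K(u) = u**(hurst - 1/2), for u > 0."""
    return u ** (hurst - 0.5)

def simulate_volterra(n_steps=200, horizon=1.0, x0=0.0, sigma=1.0, hurst=0.3, seed=0):
    """Left-point Riemann-sum approximation of X_t = x0 + sigma * int_0^t K(t - s) dW_s."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    t = np.linspace(0.0, horizon, n_steps + 1)
    # Brownian increments W_{t_{j+1}} - W_{t_j}
    dw = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    x = np.full(n_steps + 1, x0, dtype=float)
    for i in range(1, n_steps + 1):
        # Every past increment is re-weighted at each step: this is the memory
        # that makes the process non-Markovian when the kernel is not constant.
        weights = power_kernel(t[i] - t[:i], hurst)
        x[i] = x0 + sigma * np.dot(weights, dw[:i])
    return t, x
```

With `hurst=0.5` the kernel is constant and the sketch reduces to a discretized Brownian motion; a faster-decaying kernel gives less weight to old increments, which is the regime the memory-decay results refer to.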

Abstract

Predicting the conditional evolution of Volterra processes with stochastic volatility is a crucial challenge in mathematical finance. While deep neural network models offer promise in approximating the conditional law of such processes, their effectiveness is hindered by the curse of dimensionality caused by the infinite dimensionality and non-smooth nature of these problems. To address this, we propose a two-step solution. First, we develop a stable dimension reduction technique, projecting the law of a reasonably broad class of Volterra processes onto a low-dimensional statistical manifold of non-positive sectional curvature. Next, we introduce a sequential deep learning model tailored to the manifold's geometry, which we show can approximate the projected conditional law of the Volterra process. Our model leverages an auxiliary hypernetwork to dynamically update its internal parameters, allowing it to encode non-stationary dynamics of the Volterra process, and it can be interpreted as a gating mechanism in a mixture of expert models where each expert is specialized at a specific point in time. Our hypernetwork further allows us to achieve approximation rates that would seemingly only be possible with very large networks.
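The hypernetwork idea described above can be sketched as follows, assuming that each time-specific expert is a one-hidden-layer ReLU network with flattened parameters $\theta_t$ and that the hypernetwork is a small ReLU network mapping $\theta_t$ to $\theta_{t+1}$. All sizes, names (`expert_forward`, `hypernetwork`, `roll_out`), and the random, untrained weights are hypothetical; the sketch only illustrates the data flow, not the paper's architecture or training procedure.

```python
import numpy as np

D_IN, WIDTH, D_OUT = 3, 4, 2
# Number of parameters of one expert: W1, b1, W2, b2
N_PARAMS = WIDTH * D_IN + WIDTH + D_OUT * WIDTH + D_OUT

def relu(z):
    return np.maximum(z, 0.0)

def expert_forward(theta, x):
    """Evaluate the expert whose weights are unpacked from the flat vector theta."""
    w1, b1, w2, b2 = np.split(theta, np.cumsum([WIDTH * D_IN, WIDTH, D_OUT * WIDTH]))
    hidden = relu(w1.reshape(WIDTH, D_IN) @ x + b1)
    return w2.reshape(D_OUT, WIDTH) @ hidden + b2

def init_hypernetwork(rng, hidden=8, scale=0.1):
    """Random (untrained) weights for a one-hidden-layer ReLU hypernetwork."""
    return (scale * rng.normal(size=(hidden, N_PARAMS)), np.zeros(hidden),
            scale * rng.normal(size=(N_PARAMS, hidden)), np.zeros(N_PARAMS))

def hypernetwork(theta, hyper):
    """Map the parameters of the expert at time t to those of the expert at time t + 1."""
    a1, c1, a2, c2 = hyper
    return a2 @ relu(a1 @ theta + c1) + c2

def roll_out(theta0, hyper, n_steps):
    """Generate the parameter sequence theta_0, ..., theta_{n_steps} by iterating the hypernetwork."""
    thetas = [theta0]
    for _ in range(n_steps):
        thetas.append(hypernetwork(thetas[-1], hyper))
    return thetas
```

In this sketch the experts are evaluated independently, for example `expert_forward(thetas[t], x)`, so only the expert parameters evolve through time, while the input at each step is processed by a single feed-forward pass.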

Paper Structure

This paper contains 53 sections, 25 theorems, 167 equations, 6 figures, 14 tables, 3 algorithms.

Introduction
Outline
Related Work
Geometric Background
Problem Setting
A Globally NPC perturbation of the Fisher geometry for Gaussian measures
Comparison with other geometries on $\mathcal{N}_d$
Gaussian random projections
Intuitive idea
Encoding the conditional distribution of a Volterra process in $\mathcal{N}_d$
Memory decay of the Gaussian Projection operation given a decaying Volterra kernel
Universal approximation of manifold-valued processes
Standing Assumptions for Universal Approximation Results
Static Case
Dynamic/Sequential universality
...and 38 more sections

Key Result

Proposition 2.2

Let $(N, g)$ be a Riemannian manifold and let $d$ be its induced Riemannian distance. Then $(N, d)$ is a global NPC space if and only if it is complete, simply connected and of non-positive (sectional) curvature.
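As a numerical illustration of what the global NPC property means, the sketch below checks the defining inequality of Sturm's global NPC spaces, $d^2(z,y) \le \tfrac12 d^2(z,x_0) + \tfrac12 d^2(z,x_1) - \tfrac14 d^2(x_0,x_1)$ with $y$ the geodesic midpoint of $x_0$ and $x_1$. It assumes a stand-in manifold: symmetric positive-definite matrices with the affine-invariant metric, which is complete, simply connected, and non-positively curved. This is not the perturbed Fisher geometry on $\mathcal{N}_d$ constructed in the paper, and all function names are hypothetical.

```python
import numpy as np

def sym_fun(a, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * fun(vals)) @ vecs.T

def whiten(a, b):
    """Return the symmetric matrix A^{-1/2} B A^{-1/2}."""
    a_inv_sqrt = sym_fun(a, lambda v: v ** -0.5)
    m = a_inv_sqrt @ b @ a_inv_sqrt
    return 0.5 * (m + m.T)  # remove rounding asymmetry

def spd_distance(a, b):
    """Affine-invariant distance: Frobenius norm of log(A^{-1/2} B A^{-1/2})."""
    return np.sqrt(np.sum(np.log(np.linalg.eigvalsh(whiten(a, b))) ** 2))

def spd_midpoint(a, b):
    """Geodesic midpoint A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    a_sqrt = sym_fun(a, np.sqrt)
    return a_sqrt @ sym_fun(whiten(a, b), np.sqrt) @ a_sqrt

def npc_gap(x0, x1, z):
    """Right-hand side minus left-hand side of the NPC inequality; non-negative in an NPC space."""
    y = spd_midpoint(x0, x1)
    rhs = (0.5 * spd_distance(z, x0) ** 2 + 0.5 * spd_distance(z, x1) ** 2
           - 0.25 * spd_distance(x0, x1) ** 2)
    return rhs - spd_distance(z, y) ** 2

def random_spd(rng, dim=3):
    """Random well-conditioned symmetric positive-definite matrix."""
    g = rng.normal(size=(dim, dim))
    return g @ g.T + np.eye(dim)
```

In Euclidean space the same expression vanishes identically (the parallelogram law), so a strictly positive gap is a signature of negative curvature along the directions involved.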

Figures (6)

Figure 1: The GDN Model: The GDN model processes an input $x_{[-M:0]}\in \mathcal{N}^{1+\mathop{\mathrm{H}}\nolimits}$, interpreted as a sequence of input points $x_{-M},\dots,x_0$ in $\mathcal{M}$, in three phases: encoding, transformation, and decoding. First, it linearizes (purple) the inputs in $\mathcal{N}^{1+\mathop{\mathrm{H}}\nolimits}$ along products of geodesics emanating from a set of reference points $x_0^{\star},\dots,x^{\star}_M$ in $\mathcal{N}^{1+\mathop{\mathrm{H}}\nolimits}$. It then transforms the linearized features and maps them to a vector $v$ in the tangent space of $\mathcal{M}$ using a standard ReLU-MLP (yellow). In the decoding phase (green), the model maps $v$ to a point $\hat{f}(x_{[-M:0]})$ on $\mathcal{M}$ by travelling along the geodesic in $\mathcal{M}$ that emanates from a reference point $y^{\star}$ with initial velocity $v$.
Figure 2: The HGN Model: The green layer encodes sequence segments in the input manifold into distances relative to a reference/landmark point $x^{\star}$ therein. These linearized features are then processed by a ReLU MLP (yellow), which repeatedly applies fully-connected affine (also called linear) layers interspersed with ReLU activation functions (orange). Finally, the purple layer decodes the vector $v$ generated by the downsampled CNN into a manifold-valued prediction by travelling along a geodesic emanating from a reference/landmark point $y^{\star}$ with initial velocity $v$. The HGN model applies the GDN model while iteratively updating its internal parameters, at each time step, using an auxiliary ReLU network (blue) called a hypernetwork.
Figure 3: Situation I - Nearly Logarithmic Degradation of HGN Accuracy: The HGN performance slowly (logarithmically) departs from that of the GDN as time rolls forward. This is typically what is observed in most of our experiments.
Figure 4: Situation II - Nearly Perfect GDN Prediction by HGN: The HGN continues to nearly perfectly predict the performance of the GDN as time rolls forward. This occurs in a subset of experiments where the GDN parameters do not change significantly between time steps.
Figure 5: Diagram chase in the proof of the theorem on optimal GDN rates (ReLU activation).
...and 1 more figure
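The encode, transform, and decode phases described in the caption of Figure 1 can be sketched as follows. This is a minimal illustration under several assumptions: both the input and output manifolds are taken to be symmetric positive-definite matrices with the affine-invariant metric, all reference points are the identity matrix (so the Riemannian logarithm and exponential reduce to the matrix logarithm and exponential), and the MLP weights are random and untrained. The function names and the upper-triangular coordinate choice are hypothetical and not taken from the paper.

```python
import numpy as np

def sym_fun(a, fun):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * fun(vals)) @ vecs.T

def encode(points):
    """Linearize each SPD input with the log map at the identity; keep upper-triangular entries."""
    iu = np.triu_indices(points[0].shape[0])
    return np.concatenate([sym_fun(p, np.log)[iu] for p in points])

def relu_mlp(features, w1, b1, w2, b2):
    """One-hidden-layer ReLU network acting on the linearized features."""
    return w2 @ np.maximum(w1 @ features + b1, 0.0) + b2

def decode(v, dim):
    """Turn v into a symmetric tangent matrix at the identity, then apply the exponential map."""
    s = np.zeros((dim, dim))
    s[np.triu_indices(dim)] = v
    s = s + s.T - np.diag(np.diag(s))  # mirror the upper triangle
    return sym_fun(s, np.exp)

def gdn_forward(points, weights):
    """Encode the input sequence, transform it with the MLP, and decode to an SPD matrix."""
    dim = points[0].shape[0]
    return decode(relu_mlp(encode(points), *weights), dim)

def random_spd(rng, dim=2):
    """Random well-conditioned symmetric positive-definite matrix."""
    g = rng.normal(size=(dim, dim))
    return g @ g.T + np.eye(dim)
```

Because the decoder ends with an exponential map, the output is a valid point of the manifold for any MLP weights, so no projection or constraint handling is needed in this sketch.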

Theorems & Definitions (34)

Definition 2.1: NPC space
Proposition 2.2: Manifolds (Proposition 3.1 in Sturm_2003)
Proposition 2.3: Existence of barycenters (Proposition 4.3 in Sturm_2003)
Theorem 2.4: Fundamental contraction property (Sturm_2003)
Remark 3.2
Example 1: Stochastic delay differential equations
Proposition 3.3: The geometry of $(\mathcal{N}_d, \mathfrak{I})$
Definition 4.1: Gaussian random projection of $X_{\cdot}$
Theorem 4.2: Optimality and Lipschitz Stability of the Gaussian Random Projection
Proposition 4.3: Smoothness
...and 24 more
