Efficient Neural SDE Training using Wiener-Space Cubature

Luke Snow; Vikram Krishnamurthy

TL;DR

This paper addresses scalable training of neural stochastic differential equations by replacing Monte Carlo gradient estimation with Wiener-space cubature, extending cubature theory to nonlinear path-functionals and enabling deterministic, GPU-friendly ODE evaluations. It develops a Stratonovich reformulation, constructs cubature paths and weights, and proves non-asymptotic error bounds for nonlinear loss functionals. A high-order recombination algorithm drastically reduces the number of required ODE solves, achieving effective $\mathcal{O}(n^{-1})$ convergence under suitable parameter choices and providing concrete pre-processing complexity. Numerical studies show faster convergence and substantial wall-clock and memory savings compared to SDE Monte Carlo, across varying dimensions and architectures. The framework offers a principled, efficient approach to neural SDE training with potential broader impact in stochastic modeling and high-dimensional inference.

Abstract

A neural stochastic differential equation (SDE) is an SDE whose drift and diffusion terms are parametrized by neural networks. Training a neural SDE consists of optimizing the SDE vector field (neural network) parameters to minimize the expected value of an objective functional on infinite-dimensional path space. Existing training techniques focus on efficiently computing path-wise gradients of the objective functional with respect to these parameters, then pair this with Monte Carlo simulation to estimate the expected gradient. In this work we introduce a novel training technique that bypasses and improves upon this Monte Carlo simulation: we extend results in the theory of Wiener-space cubature to approximate the expected objective functional value by a weighted sum of functional evaluations of deterministic ODE solutions. Our main mathematical contribution enabling this approximation is an extension of cubature bounds to Lipschitz-nonlinear functionals acting on path space. The resulting constructive algorithm makes training more computationally efficient in two ways. First, it circumvents Brownian motion simulation and enables the use of efficient parallel ODE solvers, decreasing the complexity of path-functional evaluation. Second, and more surprisingly, we show that the number of paths required to approximate the expected loss functional value to a given accuracy can be reduced in this deterministic cubature regime. Specifically, we show that under reasonable regularity assumptions we can obtain an $\mathcal{O}(n^{-1})$ convergence rate, where $n$ is the number of path evaluations, in contrast with the standard $\mathcal{O}(n^{-1/2})$ rate of naive Monte Carlo or the $\mathcal{O}(\log(n)^d / n)$ rate of quasi-Monte Carlo.
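
The core idea in the abstract, replacing an expectation over Brownian paths by a weighted sum over deterministic ODE solutions, can be illustrated with a minimal sketch. Everything here is an illustrative assumption, not the paper's algorithm. A scalar Stratonovich SDE $dX = \mu X\,dt + \sigma X \circ dW$ stands in for a neural SDE. The loss is a function of the terminal value only, not a general path functional. The cubature is the standard degree-3 one-dimensional formula (two linear paths of slope $\pm 1/\sqrt{\Delta t}$ per sub-interval, each with weight $1/2$). The function name `cubature_expectation` and its parameters are hypothetical.

```python
import itertools
import math


def cubature_expectation(f, x0, mu, sigma, T, m, ode_steps=200):
    """Degree-3 cubature estimate of E[f(X_T)] for the toy Stratonovich SDE
    dX = mu*X dt + sigma*X o dW (an assumed stand-in for a neural SDE).

    [0, T] is split into m sub-intervals. On each one, the Brownian path is
    replaced by a linear path that moves by +sqrt(dt) or -sqrt(dt), giving
    2**m deterministic ODEs with equal weights 2**(-m).
    """
    dt = T / m                 # length of one cubature sub-interval
    h = dt / ode_steps         # RK4 step size inside a sub-interval
    weight = 0.5 ** m          # product of per-interval weights 1/2
    total = 0.0
    for signs in itertools.product((1.0, -1.0), repeat=m):
        x = x0
        for s in signs:
            slope = s / math.sqrt(dt)          # d(omega)/dt on this sub-interval
            rate = mu + sigma * slope          # ODE: dx/dt = (mu + sigma*slope) * x
            for _ in range(ode_steps):         # classical RK4 integration
                k1 = rate * x
                k2 = rate * (x + 0.5 * h * k1)
                k3 = rate * (x + 0.5 * h * k2)
                k4 = rate * (x + h * k3)
                x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        total += weight * f(x)
    return total


# Closed-form reference for this toy model: E[X_T] = x0 * exp((mu + sigma^2/2) * T).
exact = math.exp((0.1 + 0.5 ** 2 / 2) * 1.0)
err_m1 = abs(cubature_expectation(lambda x: x, 1.0, 0.1, 0.5, 1.0, m=1) - exact)
err_m4 = abs(cubature_expectation(lambda x: x, 1.0, 0.1, 0.5, 1.0, m=4) - exact)
print(err_m1, err_m4)
```

In this sketch the error shrinks as the number of sub-intervals `m` grows, but the number of ODE solves grows as `2**m`. That growth is what a recombination step, as mentioned in the TL;DR, is meant to control; no recombination is attempted here.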
