Beyond Monte Carlo: Harnessing Diffusion Models to Simulate Financial Market Dynamics
Andrew Lesniewski, Giulio Trigila
TL;DR
The paper addresses the generation of high-fidelity synthetic financial market data with diffusion models: observed returns are encoded by a forward diffusion and decoded by a learned reverse-time diffusion. It uses a score-matching framework, specifically denoising score matching (DSM), to train a noise-conditioned score network without Monte Carlo simulation, and shows that Gauss-Hermite quadrature allows accurate, scalable evaluation of the training objective. The key contributions are a practical two-stage diffusion framework, a neural score model with a tailored DSM objective, and empirical validation showing that the synthetic data reproduce the tails of the observed distribution and yield better-conditioned covariance matrices across diverse market regimes. The approach enables large-scale, tail-accurate scenario generation, with potential benefits for portfolio optimization, risk quantification, and backtesting.
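To illustrate why quadrature can stand in for sampling, here is a minimal sketch of Gauss-Hermite quadrature applied to a Gaussian expectation. It is a generic illustration, not the paper's training objective: the function names, the test integrand, and the node count are all assumptions chosen for the example.

```python
import numpy as np

def gauss_hermite_expectation(f, mean, std, n_nodes=10):
    """Approximate E[f(X)] for X ~ N(mean, std^2) by Gauss-Hermite quadrature."""
    # hermgauss returns nodes and weights for the weight function exp(-t^2).
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # The substitution x = mean + sqrt(2) * std * t turns the Gaussian
    # density into exp(-t^2) / sqrt(pi).
    x = mean + np.sqrt(2.0) * std * nodes
    return np.sum(weights * f(x)) / np.sqrt(np.pi)

def monte_carlo_expectation(f, mean, std, n_samples, seed=0):
    """Plain Monte Carlo estimate of the same expectation, for comparison."""
    rng = np.random.default_rng(seed)
    return np.mean(f(rng.normal(mean, std, n_samples)))

# Hypothetical test case with a closed form: E[exp(X)] = exp(mean + std^2 / 2).
mean, std = 0.1, 0.3
exact = np.exp(mean + 0.5 * std**2)
gh = gauss_hermite_expectation(np.exp, mean, std, n_nodes=10)
mc = monte_carlo_expectation(np.exp, mean, std, n_samples=10_000)

print(f"exact         : {exact:.12f}")
print(f"Gauss-Hermite : {gh:.12f}  (10 function evaluations)")
print(f"Monte Carlo   : {mc:.12f}  (10,000 function evaluations)")
```

For a smooth integrand such as this one, a handful of deterministic nodes is typically far more accurate than thousands of random draws, and the result carries no sampling noise.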
Abstract
We propose a highly efficient and accurate methodology for generating synthetic financial market data using a diffusion model approach. The synthetic data produced by our methodology align closely with observed market data in several key aspects: (i) they pass the two-sample Cramér-von Mises test for portfolios of assets, and (ii) Q-Q plots demonstrate consistency across quantiles, including in the tails, between observed and generated market data. Moreover, the covariance matrices derived from a large set of synthetic market data exhibit significantly lower condition numbers compared to the estimated covariance matrices of the observed data. This property makes them suitable for use as regularized versions of the latter. For model training, we develop an efficient and fast algorithm based on numerical integration rather than Monte Carlo simulations. The methodology is tested on a large set of equity data.
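The two diagnostics named in the abstract can be sketched as follows. This is a toy illustration only: the "observed" and "synthetic" samples are both drawn from a hypothetical Gaussian with a made-up covariance, standing in for real market data and for the output of a trained generator, and the portfolio weights, sample sizes, and variable names are assumptions.

```python
import numpy as np
from scipy.stats import cramervonmises_2samp

rng = np.random.default_rng(0)
n_assets = 50

# Hypothetical "true" covariance, used only to produce toy data.
A = rng.normal(size=(n_assets, n_assets))
true_cov = A @ A.T / n_assets + 0.1 * np.eye(n_assets)
chol = np.linalg.cholesky(true_cov)

def draw_returns(n_obs):
    """Draw n_obs toy return vectors with covariance true_cov."""
    return rng.normal(size=(n_obs, n_assets)) @ chol.T

observed = draw_returns(60)        # short history relative to n_assets
synthetic = draw_returns(50_000)   # stand-in for a large generated sample

# (i) Two-sample Cramér-von Mises test on the returns of one portfolio
#     (equal weights here; any fixed weight vector could be used).
weights = np.full(n_assets, 1.0 / n_assets)
result = cramervonmises_2samp(observed @ weights, synthetic @ weights)
print(f"CvM statistic = {result.statistic:.4f}, p-value = {result.pvalue:.4f}")

# (ii) Condition numbers of the two sample covariance matrices.
cond_observed = np.linalg.cond(np.cov(observed, rowvar=False))
cond_synthetic = np.linalg.cond(np.cov(synthetic, rowvar=False))
print(f"condition number, observed  : {cond_observed:.1f}")
print(f"condition number, synthetic : {cond_synthetic:.1f}")
```

In this toy setup the gap in condition numbers comes purely from sample size: with few observations per asset the smallest eigenvalues of the sample covariance are badly underestimated, whereas a large sample recovers them. That is the sense in which a covariance matrix estimated from many synthetic draws can serve as a regularized version of the observed one.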
