Approximating Langevin Monte Carlo with ResNet-like Neural Network architectures
Charles Miranda, Janina Schütte, David Sommer, Martin Eigel
TL;DR
This work studies how to approximate Langevin Monte Carlo (LMC) sampling with a ResNet-like neural network architecture that maps samples from a simple reference distribution to a target distribution defined by a potential V. It analyzes the approximation quality in the Wasserstein-2 distance under sub-Gaussianity of the intermediate LMC measures and in two drift-approximation regimes, global linear error growth and local Lipschitz constraints, and proves complexity bounds that avoid the curse of dimensionality in favorable settings. The main contributions are a formal neural surrogate for the LMC drift, uniform bounds on the variance proxies, and a proof that the resulting network can sample with arbitrary accuracy from targets with smooth, strongly convex potentials, supported by experiments on Gaussian, Gaussian mixture, and Darcy-PDE posteriors. The results provide a principled route to fast, scalable surrogate sampling in high dimensions, with practical potential for Bayesian inverse problems and uncertainty quantification, where expensive forward solves constrain traditional MCMC.
Abstract
We sample from a given target distribution by constructing a neural network that maps samples from a simple reference, e.g. the standard normal distribution, to samples from the target. To that end, we propose a neural network architecture inspired by the Langevin Monte Carlo (LMC) algorithm. Based on LMC perturbation results, we show approximation rates of the proposed architecture for smooth, log-concave target distributions, measured in the Wasserstein-$2$ distance. The analysis relies heavily on the sub-Gaussianity of the intermediate measures of the perturbed LMC process. In particular, we derive bounds on the growth of the intermediate variance proxies under different assumptions on the perturbations. Moreover, we propose an architecture similar to deep residual neural networks and derive expressivity results for approximating the map from reference samples to the target distribution.
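To illustrate why LMC suggests a residual architecture, the following minimal sketch runs the standard LMC update $x_{k+1} = x_k - h \nabla V(x_k) + \sqrt{2h}\,\xi_k$ on a Gaussian target, writing each step as a skip connection plus a drift block plus injected noise. It is not the architecture of the paper: the exact drift $-\nabla V$ is used where a trained network would stand, and the function names, the step size, the number of steps, and the target parameters are all assumptions chosen for illustration.

```python
import numpy as np

def grad_V(x, mean, precision):
    # Gradient of V(x) = 0.5 * (x - mean)^T P (x - mean) for a batch x of shape (n, d).
    return (x - mean) @ precision.T

def lmc_residual_step(x, noise, step, drift):
    # One LMC update in residual form: identity skip + drift block + noise injection.
    return x + step * drift(x) + np.sqrt(2.0 * step) * noise

def lmc_sample(x0, drift, step, n_steps, rng):
    # Stack n_steps residual blocks; each block receives fresh Gaussian noise.
    x = x0
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = lmc_residual_step(x, noise, step, drift)
    return x

# Hypothetical target: Gaussian with diagonal precision (smooth, strongly convex V).
rng = np.random.default_rng(0)
target_mean = np.array([1.0, -2.0])
precision = np.diag([1.0, 4.0])

# Exact drift; a neural surrogate would replace this callable (assumption).
drift = lambda x: -grad_V(x, target_mean, precision)

# Reference samples from the standard normal are pushed through the stacked blocks.
x0 = rng.standard_normal((20000, 2))
samples = lmc_sample(x0, drift, step=0.05, n_steps=200, rng=rng)

print("sample mean:", samples.mean(axis=0))
print("sample variance:", samples.var(axis=0))
```

For this linear drift, the discretized chain with step size $h$ has stationary variance $1/(p(1 - hp/2))$ per coordinate with precision $p$, so the sample variances sit slightly above the target variances $1/p$. This discretization bias is separate from the error introduced by replacing the drift with an approximation, which is the perturbation the analysis controls.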
