On Sampling with Approximate Transport Maps
Louis Grenioux, Alain Durmus, Éric Moulines, Marylou Gabrié
TL;DR
The paper studies sampling with approximate transport maps built from Normalizing Flows (NF) and compares three NF-enhanced strategies: neural-IS, flow-MCMC, and neutra-MCMC. It shows that methods using flow-based proposals (neural-IS and flow-MCMC) excel at multimodal targets up to moderate dimensions, while neutra-MCMC is more robust for unimodal targets but struggles to traverse energy barriers between modes. The paper also derives a new mixing-time bound for independent Metropolis-Hastings under a local Lipschitz condition on the log weights and strong convexity; the bound reveals dimension-free behavior when the flow quality constant is controlled. Real-world benchmarks in molecular systems, sparse logistic regression, and field theory corroborate the synthetic findings and highlight practical tradeoffs in wall-clock time and parallelizability. Overall, NF-enabled samplers offer strong advantages when matched to the target geometry, and hybrid strategies offer a compelling path for high-dimensional multimodal inference.
Abstract
Transport maps can ease the sampling of distributions with non-trivial geometries by transforming them into distributions that are easier to handle. The potential of this approach has risen with the development of Normalizing Flows (NF), which are maps parameterized with deep neural networks trained to push a reference distribution towards a target. Recently proposed NF-enhanced samplers blend (Markov chain) Monte Carlo methods with either (i) proposal draws from the flow or (ii) a flow-based reparametrization. In both cases, performance depends on the quality of the learned transport. The present work clarifies for the first time the relative strengths and weaknesses of these two approaches. Our study concludes that multimodal targets can be reliably handled with flow-based proposals up to moderately high dimensions. In contrast, methods relying on reparametrization struggle with multimodality but are otherwise more robust in high-dimensional settings and under poor training. To further illustrate the influence of target-proposal adequacy, we also derive a new quantitative bound for the mixing time of the Independent Metropolis-Hastings sampler.
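To make the role of target-proposal adequacy concrete, here is a minimal sketch of an Independent Metropolis-Hastings sampler in Python. It is an illustration under stated assumptions, not the authors' implementation: a fixed wide Gaussian stands in for a trained flow proposal, the target is a one-dimensional two-mode Gaussian mixture, and the names `log_target`, `log_proposal`, and `imh` are hypothetical.

```python
import numpy as np

def log_target(x):
    # Unnormalized log-density of an equal-weight mixture of N(-2, 1) and N(2, 1).
    return np.logaddexp(-0.5 * (x + 2.0) ** 2, -0.5 * (x - 2.0) ** 2)

def log_proposal(x, scale):
    # Log-density of N(0, scale^2); ASSUMPTION: stands in for a trained flow's density.
    return -0.5 * (x / scale) ** 2 - np.log(scale) - 0.5 * np.log(2.0 * np.pi)

def imh(n_steps, rng, scale=3.0):
    # Independent MH: proposals do not depend on the current state, so the
    # acceptance ratio is a ratio of importance weights w = target / proposal.
    x = rng.normal(0.0, scale)
    log_w = log_target(x) - log_proposal(x, scale)
    samples = np.empty(n_steps)
    n_accept = 0
    for t in range(n_steps):
        y = rng.normal(0.0, scale)
        log_w_y = log_target(y) - log_proposal(y, scale)
        # Accept with probability min(1, w(y) / w(x)).
        if np.log(rng.uniform()) < log_w_y - log_w:
            x, log_w = y, log_w_y
            n_accept += 1
        samples[t] = x
    return samples, n_accept / n_steps

samples, acc_rate = imh(20000, np.random.default_rng(0))
print(f"acceptance rate: {acc_rate:.2f}")
print(f"sample mean: {samples.mean():.2f}, sample variance: {samples.var():.2f}")
```

For this target the exact mean is 0 and the exact variance is 5, so the printed estimates give a quick sanity check. Shrinking `scale` until the proposal no longer covers both modes makes the importance weights heavy-tailed and the chain sticky, which is the kind of target-proposal mismatch the mixing-time bound is meant to quantify.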
