Divide, Interact, Sample: The Two-System Paradigm
James Chok, Myung Won Lee, Daniel Paulin, Geoffrey M. Vasil
TL;DR
The paper addresses the challenge of efficiently sampling high-dimensional distributions by unifying ensemble-chain, mean-field, and adaptive MCMC within a single two-system framework that pairs two interacting subsystems to propose updates for one another while preserving the target distribution $\rho$. By deriving two-system versions of overdamped and underdamped Langevin samplers (MALA and MAKLA) and providing both continuous- and discrete-time realizations, the authors enable parallel, MH-corrected updates with reduced computational costs relative to traditional ensemble methods. Extensive experiments on synthetic targets and posteriordb benchmarks show that adaptive two-system MAKLA variants achieve order-of-magnitude improvements in effective sample size per gradient evaluation compared to NUTS, and maintain robust performance across dimensions, including high-dimensional problems. The framework also clarifies the connections between ensemble, mean-field, and adaptive approaches, offering practical algorithms with strong theoretical guarantees and a scalable path to high-throughput Bayesian computation. The authors release open-source implementations to facilitate adoption and replication of their results.
Abstract
Mean-field, ensemble-chain, and adaptive samplers have historically been viewed as distinct approaches to Monte Carlo sampling. In this paper, we present a unifying two-system framework that brings all three under one roof. In our approach, an ensemble of particles is split into two interacting subsystems that propose updates for each other in a symmetric, alternating fashion. This cross-system interaction ensures that the overall ensemble has $\rho(x)$ as its invariant distribution in both the finite-particle setting and the mean-field limit. The two-system construction reveals that ensemble-chain samplers can be interpreted as finite-$N$ approximations of an ideal mean-field sampler; conversely, it provides a principled recipe to discretize mean-field Langevin dynamics into tractable parallel MCMC algorithms. The framework also connects naturally to adaptive single-chain methods: by replacing particle-based statistics with time-averaged statistics from a single chain, one recovers analogous adaptive dynamics in the long-time limit without requiring a large ensemble. We derive novel two-system versions of both overdamped and underdamped Langevin MCMC samplers within this paradigm. Across synthetic benchmarks and real-world posterior inference tasks, these two-system samplers exhibit significant performance gains over the popular No-U-Turn Sampler, achieving an order of magnitude higher effective sample sizes per gradient evaluation.
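
To make the alternating construction concrete, the following is a minimal illustrative sketch of one way two subsystems could propose updates for each other. It is not the authors' algorithm. It assumes that the cross-system statistic is the empirical covariance of the frozen subsystem, used as a preconditioner for an MH-corrected overdamped Langevin (MALA) step on the active subsystem. The target `log_rho`, the constant `SCALES`, and the function names `mala_half_step` and `two_system_mala` are hypothetical and chosen only for illustration. Because the preconditioner depends only on the frozen subsystem, each half-step is an ordinary Metropolis-Hastings kernel for the active particles, so the product of copies of $\rho$ stays invariant for any finite ensemble size.

```python
import numpy as np

# Hypothetical target (an assumption for illustration): a zero-mean Gaussian
# with per-coordinate standard deviations SCALES.
SCALES = np.array([1.0, 10.0])


def log_rho(x):
    """Unnormalised log-density of the target, evaluated row-wise."""
    return -0.5 * np.sum((x / SCALES) ** 2, axis=-1)


def grad_log_rho(x):
    """Gradient of log_rho, evaluated row-wise."""
    return -x / SCALES ** 2


def mala_half_step(active, frozen, step, rng):
    """MH-corrected Langevin update of `active`, preconditioned by `frozen`.

    The preconditioner is the empirical covariance of the frozen subsystem
    (an assumed choice of cross-system statistic). It is held fixed during
    the update, so every active particle performs a standard MALA step.
    """
    n, d = active.shape
    cov = np.cov(frozen, rowvar=False) + 1e-8 * np.eye(d)
    chol = np.linalg.cholesky(cov)
    prec = np.linalg.inv(cov)

    def drift(x):
        # Mean of the proposal: x + step * C * grad log rho(x).
        return x + step * grad_log_rho(x) @ cov

    def log_q(x_to, x_from):
        # Log proposal density N(drift(x_from), 2 * step * C), up to a constant
        # that is identical for the forward and reverse moves.
        diff = x_to - drift(x_from)
        return -np.einsum("ni,ij,nj->n", diff, prec, diff) / (4.0 * step)

    noise = rng.standard_normal((n, d)) @ chol.T
    proposal = drift(active) + np.sqrt(2.0 * step) * noise

    log_alpha = (log_rho(proposal) + log_q(active, proposal)
                 - log_rho(active) - log_q(proposal, active))
    accept = np.log(rng.uniform(size=n)) < log_alpha
    return np.where(accept[:, None], proposal, active), accept.mean()


def two_system_mala(n_per_system=50, n_iters=2000, step=0.5, seed=0):
    """Alternate half-steps so that each subsystem proposes for the other."""
    rng = np.random.default_rng(seed)
    d = SCALES.size
    sys_a = rng.standard_normal((n_per_system, d))
    sys_b = rng.standard_normal((n_per_system, d))
    samples = []
    for _ in range(n_iters):
        sys_a, _ = mala_half_step(sys_a, sys_b, step, rng)  # B informs A
        sys_b, _ = mala_half_step(sys_b, sys_a, step, rng)  # A informs B
        samples.append(np.vstack([sys_a, sys_b]))
    # Discard the first half of the iterations as burn-in.
    return np.concatenate(samples[n_iters // 2:])
```

Calling `two_system_mala()` returns the pooled post-burn-in particles from both subsystems; their per-coordinate standard deviations should be roughly comparable to `SCALES`. All particles within a half-step are updated independently given the frozen subsystem, which is what would make such a scheme straightforward to parallelize.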
