Robust Online Sampling from Possibly Moving Target Distributions
François Clément, Stefan Steinerberger
TL;DR
This work introduces a simple, robust greedy scheme for online sampling that adds points to an existing 1D point set so as to uniformly approximate a target measure $\mu$, or its CDF-transformed version, with a key update of the form $x_{n+1}=k/(n+1)$ for a suitably chosen $k$. By analyzing a continuous mean-field limit, the authors derive a functional $E(\mu)=\int_0^1 |\Phi^{-1}(t)-t|\,dt$ whose minimizers are fixed points of $\Phi$, and show that the discrete process inherits a structure that avoids spurious local minima and allows each step to be computed in $\mathcal{O}(n)$ time. They connect the discrete energy to the Wasserstein-1 distance and Halász discrepancy bounds, provide explicit proofs and constructions (including a counterexample demonstrating non-monotone energy steps), and discuss related periodic-energy sequences with strong numerical performance. The approach extends naturally to general distributions via the CDF, with practical demonstrations illustrating uniformity improvements and competitive performance against classical quasi-random sequences. The results have implications for sequential sampling, Bayesian settings with unknown budgets, and discrepancy-optimal point distributions in one dimension.
Abstract
Suppose we are given a list of points $x_1, \dots, x_n \in \mathbb{R}$ and a target probability measure $\mu$, and are asked to add points $x_{n+1}, \dots, x_{n+m}$ so that $x_1, \dots, x_{n+m}$ is as close as possible to the distribution of $\mu$; additionally, we want this to be true uniformly for all $m$. We propose a simple method that achieves this goal. It selects new points in regions where the existing set is lacking points and avoids regions that are already overly crowded. If we replace $\mu$ by another measure $\mu_2$ in the middle of the computation, the method dynamically adjusts and allows us to keep the original sampling points. The next point $x_{n+1}$ can be computed in $\mathcal{O}(n)$ steps and we obtain state-of-the-art results. The method appears to be an interesting dynamical system in its own right; we analyze a continuous mean-field version that reflects much of the same behavior.
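
To make the setting concrete, the following Python sketch illustrates the general idea of adding points where an existing set is sparse. It is not the method proposed in the paper: the rule `add_point_largest_gap` (insert the midpoint of the widest gap) is a deliberately simple stand-in chosen for illustration, and all function names are hypothetical. Uniformity on $[0,1]$ is measured with the standard one-dimensional star discrepancy, and a standard normal target is assumed only to show how points on $[0,1]$ can be mapped to another distribution through its inverse CDF.

```python
import bisect
from statistics import NormalDist


def star_discrepancy_1d(points):
    """Exact star discrepancy of points in [0, 1] against the uniform measure."""
    xs = sorted(points)
    n = len(xs)
    # Classical closed form: max over i of max(i/n - x_(i), x_(i) - (i-1)/n).
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


def add_point_largest_gap(sorted_points):
    """Stand-in greedy rule (an assumption, NOT the paper's update):
    insert the midpoint of the widest gap, with 0 and 1 as boundaries."""
    padded = [0.0] + sorted_points + [1.0]
    width, i = max((padded[j + 1] - padded[j], j) for j in range(len(padded) - 1))
    new_point = padded[i] + width / 2
    bisect.insort(sorted_points, new_point)  # keep the list sorted in place
    return new_point


# Start from a badly clustered set and extend it one point at a time.
points = sorted([0.10, 0.12, 0.15])
before = star_discrepancy_1d(points)
first_new = add_point_largest_gap(points)
for _ in range(19):
    add_point_largest_gap(points)
after = star_discrepancy_1d(points)

# Assumed example target: a standard normal, reached via its inverse CDF.
normal_samples = [NormalDist().inv_cdf(u) for u in points]
```

The original three points are never moved; only new points are appended, which mirrors the online constraint described in the abstract. Changing the target partway through would amount to switching the inverse CDF used in the last step, though how the paper's method handles this is not captured by this stand-in.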
