The Sample Complexity of Approximate Rejection Sampling with Applications to Smoothed Online Learning
Adam Block, Yury Polyanskiy
TL;DR
This paper studies the problem of drawing a sample from a target distribution $\nu$ using $n$ i.i.d. samples from a base distribution $\mu$ under a general $f$-divergence constraint $D_f(\nu \| \mu) \le D$. It establishes nearly tight upper and lower bounds on the required sample size, showing that a modified rejection sampler achieves $\mathrm{TV}(P_{X_{j^*}}, \nu) \le \varepsilon$ when $n \ge \frac{2}{1-\varepsilon} \log\left(\frac{2}{\varepsilon}\right) (f')^{-1}\left(\frac{4 D_f(\nu \| \mu)}{\varepsilon}\right) \lor 2$. When $f'$ is bounded (that is, $f$ grows at most linearly), approximate sampling is impossible in general; when $f$ is superlinear, the bounds are tight up to polylogarithmic factors. The results connect to smoothed online learning by introducing $f$-smoothed adversaries and deriving minimax regret bounds: for Rényi-smoothed adversaries the rates approach the known bounds for adversaries with bounded density ratio as the order $\lambda$ grows, while KL-smoothed adversaries incur slower, $T^{2/3}$-type rates. The paper also develops oracle-efficient algorithms that remain no-regret under $f$-smoothed constraints, and compares importance sampling with rejection sampling for mean estimation uniformly over a function class. Overall, it provides a unified information-theoretic treatment of sampling under $f$-divergence constraints, with implications for online learning and robust statistics.
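To make the sample-size condition concrete, the following worked instances plug two common generators into the bound stated above. The choices $f(x) = x \log x$ (KL) and $f(x) = (x-1)^2$ ($\chi^2$) are assumptions for illustration; the paper's normalization of $f$ may differ, which would change the constants. Here $D$ stands for $D_f(\nu \| \mu)$.

```latex
% Assumption: standard generators; the paper's normalization may differ.
% KL divergence: f(x) = x \log x
\begin{align*}
f'(x) &= 1 + \log x, & (f')^{-1}(y) &= e^{y-1}, \\
n &\ge \frac{2}{1-\varepsilon}\,\log\!\left(\frac{2}{\varepsilon}\right)
      \exp\!\left(\frac{4D}{\varepsilon} - 1\right) \lor 2 .
\end{align*}
% chi-squared divergence: f(x) = (x-1)^2
\begin{align*}
f'(x) &= 2(x-1), & (f')^{-1}(y) &= 1 + \frac{y}{2}, \\
n &\ge \frac{2}{1-\varepsilon}\,\log\!\left(\frac{2}{\varepsilon}\right)
      \left(1 + \frac{2D}{\varepsilon}\right) \lor 2 .
\end{align*}
```

Under these assumed generators, the sufficient sample size grows exponentially in $D/\varepsilon$ for KL but only linearly in $D/\varepsilon$ (up to the logarithmic factor) for $\chi^2$.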
Abstract
Suppose we are given access to $n$ independent samples from a distribution $\mu$ and we wish to output one of them with the goal of making the output distributed as close as possible to a target distribution $\nu$. In this work we show that the optimal total variation distance as a function of $n$ is given by $\tilde{\Theta}\left(\frac{D}{f'(n)}\right)$ over the class of all pairs $\nu, \mu$ with a bounded $f$-divergence $D_f(\nu \| \mu) \leq D$. Previously, this question was studied only for the case when the Radon-Nikodym derivative of $\nu$ with respect to $\mu$ is uniformly bounded. We then consider an application in the seemingly very different field of smoothed online learning, where we show that recent results on the minimax regret and the regret of oracle-efficient algorithms still hold even under relaxed constraints on the adversary (to have bounded $f$-divergence, as opposed to bounded Radon-Nikodym derivative). Finally, we also study the efficacy of importance sampling for mean estimation uniformly over a function class and compare importance sampling with rejection sampling.
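The selection problem described above can be illustrated with a minimal sketch of a truncated rejection sampler. This is a generic construction under stated assumptions, not necessarily the authors' exact procedure: the function name `approx_rejection_sample`, the truncation level `cap`, the fallback to index 0, and the Gaussian toy example in `demo` are all hypothetical choices made for illustration, and the density ratio $d\nu/d\mu$ is assumed to be known.

```python
import math
import random


def approx_rejection_sample(samples, density_ratio, cap, rng):
    """Return the index of one of the given base samples.

    samples       : list of i.i.d. draws from the base distribution mu
    density_ratio : callable giving d(nu)/d(mu) at a point (assumed known)
    cap           : truncation level for the density ratio (free parameter
                    of this sketch, must be positive)
    rng           : random.Random instance
    """
    for j, x in enumerate(samples):
        # Accept sample j with probability min(ratio / cap, 1).
        if rng.random() < min(density_ratio(x) / cap, 1.0):
            return j
    # No sample was accepted: fall back to an arbitrary index (assumption).
    return 0


def demo(num_trials=2000, n=200, cap=20.0, seed=0):
    """Toy check with mu = N(0, 1) and nu = N(1, 1).

    For these Gaussians, d(nu)/d(mu)(x) = exp(x - 1/2).
    Returns the average of the selected samples over all trials.
    """
    rng = random.Random(seed)
    ratio = lambda x: math.exp(x - 0.5)
    total = 0.0
    for _ in range(num_trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += xs[approx_rejection_sample(xs, ratio, cap, rng)]
    return total / num_trials
```

In this toy setting, `demo()` should return a value close to 1, the mean of the target $\nu$. Truncating the ratio at `cap` introduces a small bias, and a small `n` makes the fallback branch more likely; both effects are the kind of error that the total variation bounds quantify.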
