Tuning Sequential Monte Carlo Samplers via Greedy Incremental Divergence Minimization
Kyurae Kim, Zuheng Xu, Jacob R. Gardner, Trevor Campbell
TL;DR
This work tackles the challenge of tuning path proposal kernels in sequential Monte Carlo, especially for unadjusted kernels where traditional MCMC tuning does not apply. It introduces a greedy, incremental KL divergence objective over SMC path measures and provides a gradient-free AdaptStepsize procedure to tune scalar kernel step sizes, with extensions to kinetic Langevin dynamics. The proposed framework enables adaptive SMC samplers (SMC-LMC and SMC-KLMC) that achieve lower variance in normalizing constant estimates and competitive performance compared with end-to-end optimization, while requiring far less computational effort. Empirical results across multiple benchmarks demonstrate robust adaptation, favorable scaling in high dimensions, and practical guidance for applying adaptive SMC in static models and diffusion-prior contexts.
Abstract
The performance of sequential Monte Carlo (SMC) samplers depends heavily on the tuning of the Markov kernels used in the path proposal. For SMC samplers with unadjusted Markov kernels, standard tuning objectives, such as the Metropolis-Hastings acceptance rate or the expected squared jump distance, are no longer applicable. While stochastic gradient-based end-to-end optimization has been explored for tuning SMC samplers, it often incurs excessive training costs, even for tuning just the kernel step sizes. In this work, we propose a general adaptation framework for tuning the Markov kernels in SMC samplers by minimizing the incremental Kullback-Leibler (KL) divergence between the proposal and target paths. For step size tuning, we provide a gradient- and tuning-free algorithm that is generally applicable to kernels such as Langevin Monte Carlo (LMC). We further demonstrate the utility of our approach by providing a tailored scheme for tuning kinetic LMC used in SMC samplers. Our implementations obtain a full schedule of tuned parameters at the cost of a few vanilla SMC runs, a fraction of the cost of gradient-based approaches.
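To make the greedy idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's AdaptStepsize procedure. It assumes a toy one-dimensional geometric annealing path between two Gaussians, an unadjusted LMC forward kernel, and a backward kernel taken to be the same LMC kernel evaluated at the new point. Under these assumptions, the incremental KL divergence between proposal and target paths equals minus the expected log incremental weight plus a constant that does not depend on the step size, so the sketch simply picks, from a hypothetical grid, the step size with the largest mean log incremental weight. All function names, the grid, and the toy targets are illustrative choices.

```python
import numpy as np

def log_normal_pdf(x, mean, var):
    # Log density of N(mean, var), evaluated elementwise.
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def make_annealed_target(beta):
    # Toy geometric path between N(0, 1) and N(3, 0.5^2) (assumed example).
    def log_gamma(x):
        return ((1.0 - beta) * log_normal_pdf(x, 0.0, 1.0)
                + beta * log_normal_pdf(x, 3.0, 0.25))
    def grad_log_gamma(x):
        return (1.0 - beta) * (-x) + beta * (-(x - 3.0) / 0.25)
    return log_gamma, grad_log_gamma

def mean_log_incremental_weight(x_prev, noise, h, log_gamma_prev,
                                log_gamma_cur, grad_cur):
    # Forward unadjusted LMC step targeting the current annealed density.
    fwd_mean = x_prev + h * grad_cur(x_prev)
    x_new = fwd_mean + np.sqrt(2.0 * h) * noise
    log_fwd = log_normal_pdf(x_new, fwd_mean, 2.0 * h)
    # Backward kernel: the same LMC kernel started at x_new (an assumption).
    bwd_mean = x_new + h * grad_cur(x_new)
    log_bwd = log_normal_pdf(x_prev, bwd_mean, 2.0 * h)
    log_w = log_gamma_cur(x_new) + log_bwd - log_gamma_prev(x_prev) - log_fwd
    # Maximizing this mean minimizes the incremental KL estimate.
    return np.mean(log_w)

rng = np.random.default_rng(0)
x_prev = rng.standard_normal(2000)   # exact samples from the beta = 0 target
noise = rng.standard_normal(2000)    # shared across step sizes (common random numbers)

log_gamma_prev, _ = make_annealed_target(0.0)
log_gamma_cur, grad_cur = make_annealed_target(0.1)

grid = np.logspace(-4, 0, 9)         # hypothetical candidate step sizes
scores = np.array([
    mean_log_incremental_weight(x_prev, noise, h, log_gamma_prev,
                                log_gamma_cur, grad_cur)
    for h in grid
])
best_h = grid[int(np.argmax(scores))]
print("selected step size:", best_h)
```

Repeating this selection at each temperature, with the particles produced by the previous step, would yield a full step size schedule one step at a time. A grid search is used here only for simplicity and stands in for whatever one-dimensional search is preferred.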
