Importance Sampling via Score-based Generative Models
Heasung Kim, Taekyun Lee, Hyeji Kim, Gustavo de Veciana
TL;DR
This work introduces a training-free importance sampling framework that uses a pretrained score-based diffusion model of a base distribution $p(\mathbf{x})$ to sample from $q(\mathbf{x}) \propto l(\mathbf{x}) p(\mathbf{x})$ for arbitrary differentiable weights $l(\mathbf{x})$. The core idea replaces the intractable score $\nabla_\mathbf{x} \log q_t(\mathbf{x})$ of the noised target with a tractable approximation that combines the base score $\nabla_\mathbf{x} \log p_t(\mathbf{x})$ with a Tweedie-based correction involving $l(\mathbf{x})$, yielding a backward SDE that generates importance samples without retraining. The authors prove an upper bound on the gap between the true and approximated scores; under mild smoothness assumptions, this bound vanishes as the diffusion time $t$ approaches zero. Empirically, the proposed method (ISSGM) achieves competitive or superior performance on synthetic manifolds and real-image tasks, including neural classifier weighting and high-frequency sampling, while remaining training-free. The approach promises practical benefits for adaptive sampling, bias mitigation, and model interpretation by enabling flexible, data-efficient importance sampling from a single base distribution.
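To make the score approximation concrete, the following is a minimal sketch on a 1D toy problem where everything is available in closed form. It is not the authors' implementation, and the exact form of their correction term is not given in this summary. The sketch assumes the common surrogate $\nabla_\mathbf{x} \log q_t(\mathbf{x}) \approx \nabla_\mathbf{x} \log p_t(\mathbf{x}) + \nabla_\mathbf{x} \log l(\hat{\mathbf{x}}_0(\mathbf{x}))$, where $\hat{\mathbf{x}}_0$ is the Tweedie denoised estimate. The base $p = \mathcal{N}(0, 1)$, the weight $l(x) = \exp(-k x^2 / 2)$, the noising rule $x_t = x_0 + \sigma \epsilon$, and all function names are illustrative assumptions.

```python
# Toy 1D check of a Tweedie-based score approximation.
# Assumptions (not from the paper): base p = N(0, 1), weight l(x) = exp(-k x^2 / 2),
# variance-exploding noising x_t = x_0 + sigma * eps.

def base_score(x, sigma):
    """Exact score of the noised base p_t = N(0, 1 + sigma^2)."""
    return -x / (1.0 + sigma ** 2)

def tweedie_denoise(x, sigma):
    """Tweedie's formula: E[x_0 | x_t = x] = x + sigma^2 * score_p_t(x)."""
    return x + sigma ** 2 * base_score(x, sigma)

def grad_log_weight(x0, k):
    """Gradient of log l(x0) = -k * x0^2 / 2."""
    return -k * x0

def approx_target_score(x, sigma, k):
    """Base score plus the gradient of log l at the denoised estimate."""
    x0_hat = tweedie_denoise(x, sigma)
    # Chain rule: for this Gaussian base, x0_hat = x / (1 + sigma^2).
    dx0_dx = 1.0 / (1.0 + sigma ** 2)
    return base_score(x, sigma) + grad_log_weight(x0_hat, k) * dx0_dx

def true_target_score(x, sigma, k):
    """q = N(0, 1/(1+k)), so the noised target is q_t = N(0, 1/(1+k) + sigma^2)."""
    return -x / (1.0 / (1.0 + k) + sigma ** 2)

def score_gap(x, sigma, k):
    return abs(true_target_score(x, sigma, k) - approx_target_score(x, sigma, k))

# Gap at x = 1 for decreasing noise levels.
gaps = [score_gap(1.0, s, 3.0) for s in (1.0, 0.1, 0.0)]
print(gaps)
```

In this toy setting the gap shrinks as the noise level decreases and is zero at $\sigma = 0$, which is consistent with the qualitative behavior of the bound described above. It does not reproduce the bound itself.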
Abstract
Importance sampling, which involves sampling from a probability density function (PDF) proportional to the product of an importance weight function and a base PDF, is a powerful technique with applications in variance reduction, biased or customized sampling, data augmentation, and beyond. Inspired by the growing availability of score-based generative models (SGMs), we propose an entirely training-free importance sampling framework that relies solely on an SGM for the base PDF. Our key innovation is realizing the importance sampling process as a backward diffusion process, expressed in terms of the score function of the base PDF and the specified importance weight function, both of which are readily available, eliminating the need for any additional training. We conduct a thorough analysis demonstrating the method's scalability and effectiveness across diverse datasets and tasks, including importance sampling for industrial and natural images with neural importance weight functions. The training-free aspect of our method is particularly compelling in real-world scenarios where a single base distribution underlies multiple biased sampling tasks, each requiring a different importance weight function. To the best of our knowledge, our approach is the first importance sampling framework to achieve this.
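As an illustration of importance sampling realized as a backward diffusion, here is a hedged sketch of an Euler-Maruyama sampler on a 1D toy problem. It is an assumption-laden example, not the paper's algorithm: the base is $p = \mathcal{N}(0, 1)$ with an analytic score standing in for a pretrained SGM, the weight is $l(x) = \exp(a x)$ (so the target is $q = \mathcal{N}(a, 1)$), the forward process is $x_t = x_0 + \sqrt{t}\,\epsilon$, and the weight enters through the gradient of $\log l$ at the Tweedie denoised point. Function names and hyperparameters are hypothetical.

```python
import math
import random

def approx_target_score(x, t, a):
    """Base score plus a Tweedie-based weight correction (toy, closed form)."""
    base = -x / (1.0 + t)        # score of the noised base p_t = N(0, 1 + t)
    # Tweedie: x0_hat = x + t * base = x / (1 + t); log l(x0_hat) = a * x0_hat,
    # so its gradient with respect to x is a / (1 + t).
    correction = a / (1.0 + t)
    return base + correction

def sample_backward(a, n_samples=1000, t_max=50.0, n_steps=500, seed=0):
    """Integrate the backward SDE from t_max down to 0 with Euler-Maruyama."""
    rng = random.Random(seed)
    dt = t_max / n_steps
    samples = []
    for _ in range(n_samples):
        # Start from the noised base p_T, which is close to q_T when t_max is large.
        x = rng.gauss(0.0, math.sqrt(1.0 + t_max))
        t = t_max
        for _ in range(n_steps):
            # Backward step for the forward process dx = dw (unit diffusion coefficient).
            x += dt * approx_target_score(x, t, a) + math.sqrt(dt) * rng.gauss(0.0, 1.0)
            t -= dt
        samples.append(x)
    return samples

samples = sample_backward(a=2.0)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

Only the weight-dependent `correction` term changes when a different importance weight is used; the base score is reused unchanged, which is the sense in which a single base model can serve multiple biased sampling tasks. For this log-linear weight and Gaussian base the correction happens to be exact, so the samples should concentrate near mean $a$ with roughly unit variance, up to discretization and initialization error.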
