Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm
Qiang Liu, Dilin Wang
TL;DR
This work introduces Stein Variational Gradient Descent (SVGD), a general-purpose, particle-based variational inference method. SVGD deterministically transports a set of particles toward a target posterior by performing functional gradient descent in a reproducing kernel Hilbert space (RKHS) to minimize the KL divergence between the particle distribution and the target. Using the kernelized Stein discrepancy, it derives a closed-form steepest-descent direction; the resulting update combines a kernel-smoothed gradient of the log posterior with a repulsive interaction that maintains particle diversity. The main contributions are a theoretical link between the derivative of the KL divergence under smooth transforms and Stein’s identity, a practical algorithm that reduces to gradient ascent for MAP estimation with a single particle and performs full Bayesian inference with multiple particles, and experiments on toy and real-world data in which SVGD is competitive with existing state-of-the-art methods. Because it requires only the gradient of the log density, the method applies to unnormalized posteriors, and it scales to large datasets through minibatch gradient estimates, bridging optimization-style updates with probabilistic inference.
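The update summarized above can be written out explicitly. As a sketch, with particles $x_1, \dots, x_n$, target density $p$, positive definite kernel $k$, and step size $\epsilon$ (symbols introduced here for illustration; they do not appear in this summary):

\[
x_i \leftarrow x_i + \epsilon\,\hat{\phi}^*(x_i), \qquad
\hat{\phi}^*(x) = \frac{1}{n}\sum_{j=1}^{n}\Big[\,k(x_j, x)\,\nabla_{x_j}\log p(x_j) + \nabla_{x_j} k(x_j, x)\Big].
\]

The first term is the kernel-smoothed gradient that pulls particles toward high-probability regions; the second is the repulsive term that pushes particles away from one another. Only $\nabla \log p$ appears, so the normalizing constant of $p$ is never needed. With $n = 1$ and a kernel satisfying $\nabla_x k(x, x) = 0$ (for example, an RBF kernel), the update reduces to gradient ascent on $\log p$, which is MAP estimation.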
Abstract
We propose a general-purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization. Our method iteratively transports a set of particles to match the target distribution by applying a form of functional gradient descent that minimizes the KL divergence. Empirical studies are performed on various real-world models and datasets, on which our method is competitive with existing state-of-the-art methods. The derivation of our method is based on a new theoretical result that connects the derivative of KL divergence under smooth transforms with Stein's identity and a recently proposed kernelized Stein discrepancy, which is of independent interest.
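To make the particle transport concrete, the following is a minimal NumPy sketch of one way the update could be implemented. It is an illustration, not the authors' reference code. It assumes an RBF kernel, a median-based bandwidth heuristic, a fixed step size, and a toy 2-D Gaussian target; the function names (`rbf_kernel`, `median_bandwidth`, `svgd_step`) and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def rbf_kernel(x, h):
    """RBF kernel k(a, b) = exp(-||a - b||^2 / h) on particles x of shape (n, d).

    Returns K with K[j, i] = k(x_j, x_i) and grad_K with
    grad_K[j, i] = gradient of k(x_j, x_i) with respect to x_j.
    """
    diff = x[:, None, :] - x[None, :, :]      # diff[j, i] = x_j - x_i
    sq_dist = np.sum(diff ** 2, axis=-1)
    K = np.exp(-sq_dist / h)
    grad_K = (-2.0 / h) * diff * K[:, :, None]
    return K, grad_K


def median_bandwidth(x):
    """Assumed heuristic: median squared pairwise distance over log(n + 1)."""
    diff = x[:, None, :] - x[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    h = np.median(sq_dist) / np.log(x.shape[0] + 1.0)
    return max(h, 1e-8)                       # guard against a zero bandwidth


def svgd_step(x, grad_log_p, step_size, h=None):
    """Move every particle one step along the kernelized update direction."""
    n = x.shape[0]
    if h is None:
        h = median_bandwidth(x)
    K, grad_K = rbf_kernel(x, h)
    score = grad_log_p(x)                     # (n, d): gradient of log p at each particle
    # Driving term (kernel-weighted scores) plus repulsive term (kernel gradients).
    phi = (K.T @ score + grad_K.sum(axis=0)) / n
    return x + step_size * phi


# Toy target (assumption): an isotropic Gaussian with mean mu and unit covariance.
# Only the gradient of the log density is needed, not the normalizing constant.
mu = np.array([2.0, -1.0])


def grad_log_p(x):
    return -(x - mu)


rng = np.random.default_rng(0)
particles = rng.normal(size=(50, 2))          # initial particles drawn from N(0, I)
for _ in range(1000):
    particles = svgd_step(particles, grad_log_p, step_size=0.1)

print(particles.mean(axis=0), particles.std(axis=0))
```

With a single particle the kernel gradient vanishes and `svgd_step` becomes a plain gradient ascent step on the log density, matching the reduction to MAP estimation. A fixed step size is used here only for brevity; an adaptive step size is a common alternative.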
