Functional Gradient Flows for Constrained Sampling

Shiyue Zhang, Longlin Yu, Ziheng Cheng, Cheng Zhang

TL;DR

This paper offers a general solution to constrained sampling: it introduces a boundary condition for the gradient flow that confines the particles to the specified domain, and it proposes a new functional gradient ParVI method built on that condition, called constrained functional gradient flow (CFG).

Abstract

Recently, through a unified gradient flow perspective of Markov chain Monte Carlo (MCMC) and variational inference (VI), particle-based variational inference methods (ParVIs) have been proposed that tend to combine the best of both worlds. While typical ParVIs such as Stein Variational Gradient Descent (SVGD) approximate the gradient flow within a reproducing kernel Hilbert space (RKHS), many attempts have been made recently to replace RKHS with more expressive function spaces, such as neural networks. While successful, these methods are mainly designed for sampling from unconstrained domains. In this paper, we offer a general solution to constrained sampling by introducing a boundary condition for the gradient flow which would confine the particles within the specific domain. This allows us to propose a new functional gradient ParVI method for constrained sampling, called constrained functional gradient flow (CFG), with provable continuous-time convergence in total variation (TV). We also present novel numerical strategies to handle the boundary integral term arising from the domain constraints. Our theory and experiments demonstrate the effectiveness of the proposed framework.
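For readers unfamiliar with the RKHS baseline mentioned in the abstract, the following is a minimal sketch of one standard, unconstrained SVGD update with an RBF kernel. It is not the CFG method of this paper and it has no boundary handling; the function names `svgd_step` and `rbf_kernel`, the fixed bandwidth `h`, and the step size are illustrative assumptions.

```python
import numpy as np


def rbf_kernel(x, h):
    """Return the RBF kernel matrix k[i, j] and pairwise differences x_i - x_j."""
    diff = x[:, None, :] - x[None, :, :]          # shape (n, n, d)
    sq_dist = np.sum(diff ** 2, axis=-1)          # shape (n, n)
    k = np.exp(-sq_dist / (2.0 * h ** 2))
    return k, diff


def svgd_step(x, grad_logp, step, h):
    """One unconstrained SVGD update (assumed fixed bandwidth h).

    x         : (n, d) array of particles
    grad_logp : callable mapping (n, d) particles to (n, d) score values
    """
    n = x.shape[0]
    k, diff = rbf_kernel(x, h)
    # Driving term: kernel-weighted average of the scores.
    attract = k @ grad_logp(x)
    # Repulsive term: sum_j grad_{x_j} k(x_j, x_i) = sum_j (x_i - x_j) k_ij / h^2.
    repulse = np.sum(diff * k[:, :, None], axis=1) / h ** 2
    return x + step * (attract + repulse) / n
```

As a usage example, with a standard normal target the score is `lambda z: -z`; repeatedly applying `svgd_step` to particles initialised away from the origin moves them towards the target. Nothing in this update prevents particles from leaving a constrained domain, which is the gap the paper addresses.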

Paper Structure

This paper contains 47 sections, 8 theorems, 37 equations, 10 figures, 8 tables, 1 algorithm.

Key Result

Proposition 4.1

If $v_t\cdot \vec{\bm{n}}\le 0$ on $\partial\Omega$, then the mass $p_t(\Omega)$ inside the domain is non-decreasing in $t$.
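A short sketch of why a statement of this form holds, under assumptions not spelled out on this page: $\vec{\bm{n}}$ is taken to be the outward unit normal on $\partial\Omega$, the density $p_t$ evolves by the continuity equation driven by the velocity field $v_t$, and everything is smooth enough for the divergence theorem to apply. The paper's own proof may differ in detail.

```latex
\begin{align*}
\partial_t p_t + \nabla\cdot(p_t v_t) &= 0
  && \text{(continuity equation, assumed)} \\
\frac{d}{dt}\, p_t(\Omega)
  = \int_{\Omega} \partial_t p_t \, dx
  &= -\int_{\Omega} \nabla\cdot(p_t v_t)\, dx \\
  &= -\int_{\partial\Omega} p_t \, v_t\cdot\vec{\bm{n}} \, dS
  \;\ge\; 0
  && \text{(divergence theorem; } p_t \ge 0,\ v_t\cdot\vec{\bm{n}}\le 0\text{)}
\end{align*}
```

In words: mass can leave $\Omega$ only through its boundary, and an inward-pointing or tangential velocity there makes the outgoing flux non-positive.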

Figures (10)

  • Figure 1: Left: CFG sampled particles at different numbers of iterations on constrained domains (ring, cardioid, double-moon, block). Right: The convergence curves of MSVGD, CFG and MIED on the block constraint.
  • Figure 2: Left: Wasserstein-2 distance of SPH, CFG and MIED versus the number of particles. Right: Energy distance of SPH, CFG and MIED versus the number of particles. Both on a synthetic dataset.
  • Figure 3: Left: Bayesian Lasso ($q=1$) using Spherical HMC (upper left), CFG (upper middle) and MIED (upper right); Bayesian Bridge Regression ($q=1.2$) using Spherical HMC (lower left), CFG (lower middle) and MIED (lower right). Right: Results of the monotonic Bayesian neural network with $\epsilon=0.01$. Only the portion of the y-axis below $0.02$ is shown, to better display the performance of models that satisfy the constraint.
  • Figure 4: Left: MSE of boundary integral estimation of distribution $p_1$. Middle: MSE of boundary integral estimation of distribution $p_2$. Right: MSE of boundary integral estimation of distribution $p_3$.
  • Figure 5: MSE of boundary integral estimation of distribution $p_1$ and velocity $v_1$ using fixed edgewidths and adaptive edgewidth.
  • ...and 5 more figures

Theorems & Definitions (13)

  • Proposition 4.1
  • Example 4.2
  • Theorem 5.2
  • Proposition 5.5
  • Theorem 5.6
  • Proposition A.1
  • proof
  • Theorem B.1
  • proof
  • Proposition B.2
  • ...and 3 more