Accelerating Constrained Sampling: A Large Deviations Approach

Yingli Wang, Changwei Tu, Xiaoyu Wang, Lingjiong Zhu

Abstract

The problem of sampling a target probability distribution on a constrained domain arises in many applications, including machine learning. For constrained sampling, various Langevin algorithms have been proposed and studied in the literature, such as projected Langevin Monte Carlo (PLMC), based on the discretization of reflected Langevin dynamics (RLD), and, more generally, skew-reflected non-reversible Langevin Monte Carlo (SRNLMC), based on the discretization of skew-reflected non-reversible Langevin dynamics (SRNLD). This work focuses on the long-time behavior of SRNLD, where a skew-symmetric matrix is added to RLD. Although acceleration for SRNLD has been studied, it is not clear how one should design the skew-symmetric matrix in the dynamics to achieve good performance in practice. We establish a large deviation principle (LDP) for the empirical measure of SRNLD when the skew-symmetric matrix is chosen such that its product with the outward unit normal vector field on the boundary is zero. By explicitly characterizing the rate functions, we show that this choice of the skew-symmetric matrix accelerates the convergence to the target distribution compared to RLD and reduces the asymptotic variance. Numerical experiments for SRNLMC based on the proposed skew-symmetric matrix show superior performance, which validates the theoretical findings from the large deviations theory.
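To make the boundary condition in the abstract concrete, the following is a minimal sketch of a projected Langevin step with and without a skew-symmetric perturbation. Everything specific here is an illustrative assumption and not taken from the paper: the domain is the centered unit ball, the potential is $f(x)=|x|^2/2$, and the skew field is $J(x)=(1-|x|^2)A$ for a constant skew-symmetric $A$, so that $J(x)\mathbf n(x)=0$ on the boundary and the correction step reduces to the Euclidean projection. The extra term $\nabla\cdot J=-2Ax$ is the divergence correction that, in continuous time, keeps $e^{-f}$ invariant for a state-dependent $J$. The function names (`project_ball`, `skew_drift`, `sample`) are hypothetical.

```python
import numpy as np


def project_ball(x, radius=1.0):
    """Euclidean projection onto the centered ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def grad_f(x):
    """Gradient of the assumed potential f(x) = |x|^2 / 2."""
    return x


def skew_drift(x, A, radius=1.0):
    """Non-reversible part of the drift for J(x) = (radius^2 - |x|^2) A.

    Returns -J(x) grad_f(x) + div J(x), where div J(x) = -2 A x.
    J vanishes on the boundary, so J(x) n(x) = 0 there (assumed design).
    """
    s = radius**2 - x @ x
    return -s * (A @ grad_f(x)) - 2.0 * (A @ x)


def sample(n_steps, step, A=None, dim=2, seed=0):
    """Euler step followed by projection; A=None gives the reversible scheme."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    out = np.empty((n_steps, dim))
    for k in range(n_steps):
        drift = -grad_f(x)
        if A is not None:
            drift = drift + skew_drift(x, A)
        noise = np.sqrt(2.0 * step) * rng.standard_normal(dim)
        # Unconstrained proposal, then map back into the ball.
        x = project_ball(x + step * drift + noise)
        out[k] = x
    return out


# Constant skew-symmetric matrix (A^T = -A), chosen arbitrarily.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
plmc = sample(5000, 1e-2)          # projected scheme without skew term
srnlmc = sample(5000, 1e-2, A=A)   # projected scheme with skew term
```

The sketch only shows the structure of one iteration; it makes no claim about convergence speed, and the paper's own choice of skew-symmetric matrix and discretization may differ.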

Paper Structure

This paper contains 46 sections, 11 theorems, 150 equations, 15 figures.

Key Result

Lemma 1

Under Assumptions assump:regularity_local, assump:boundary and assump:nablaJ, the process $(X_t)_{t\ge0}$ defined by Eq. eq:sde_local with Neumann boundary conditions has a unique invariant probability measure $\mu \in \mathcal{P}(K)$. This measure has a density with respect to Lebesgue measure $dx$

Figures (15)

  • Figure 1: Geometric interpretation of the dynamics. (a) At a boundary point $x_\tau\in\partial K$, standard reflection pushes along the normal direction $\mathbf n$, whereas skew-reflection pushes along $\mathbf n^J$; when $J(x)\mathbf n(x)=0$ on $\partial K$, $\mathbf n^J=\mathbf n$ on $\partial K$. Otherwise, $\mathbf n^J$ generally introduces an additional tangential component (red) induced by the skew field. (b) Starting from $x_k$, the algorithm proposes an unconstrained step to $\tilde{x}_{k+1}$ and then maps it back to $K$ via either the Euclidean projection $\mathcal{P}_K$ (solid) or an oblique correction in the direction $-\mathbf n^J$ leading to the skew-projection $\mathcal{P}^J_K$ (dashed).
  • Figure 2: Visualized density plots for the first 2 dimensions with a centered ball constraint.
  • Figure 3: $\mathcal{W}_1$ distance in each dimension of PLMC and SRNLMC with a centered ball constraint.
  • Figure 4: Visualized density plots for the first 2 dimensions with a smoothed $\ell_p$ ball constraint with a sublevel-set.
  • Figure 5: $\mathcal{W}_1$ distance in each dimension of PLMC and SRNLMC with a smoothed $\ell_p$ ball constraint with a sublevel-set.
  • ...and 10 more figures

Theorems & Definitions (26)

  • Lemma 1: Existence and Uniqueness of Invariant Measure
  • Definition 1: Scaled Cumulant Generating Function
  • Lemma 2: Existence of $\lambda(g)$ and a principal eigenfunction
  • proof
  • Lemma 3
  • proof
  • Lemma 4
  • proof
  • Lemma 5: Exponential Tightness
  • proof
  • ...and 16 more