Constrained Sampling with Primal-Dual Langevin Monte Carlo

Luiz F. O. Chamon, Mohammad Reza Karimi, Anna Korba

TL;DR

This paper tackles sampling from a target distribution π while enforcing distribution-level constraints on moments and other statistics. It introduces PD-LMC, a discrete-time primal-dual Langevin Monte Carlo method operating in Wasserstein space to jointly optimize over the sample distribution μ and dual variables (λ,ν). Under (strong) convexity and log-Sobolev inequalities, the authors establish sublinear convergence of the algorithm in KL divergence and Wasserstein distance, respectively, and extend results to LSIs with a two-timescale DLMC variant. Through experiments on constrained Gaussian sampling, fairness in Bayesian inference, and counterfactual stock-market analysis, PD-LMC demonstrates effective constraint satisfaction with limited excursions outside feasible regions and provides interpretable dual variables that quantify constraint sensitivity and counterfactual impact.

Abstract

This work considers the problem of sampling from a probability distribution known up to a normalization constant while satisfying a set of statistical constraints specified by the expected values of general nonlinear functions. This problem finds applications in, e.g., Bayesian inference, where it can constrain moments to evaluate counterfactual scenarios or enforce desiderata such as prediction fairness. Methods developed to handle support constraints, such as those based on mirror maps, barriers, and penalties, are not suited for this task. This work therefore relies on gradient descent-ascent dynamics in Wasserstein space to put forward a discrete-time primal-dual Langevin Monte Carlo algorithm (PD-LMC) that simultaneously constrains the target distribution and samples from it. We analyze the convergence of PD-LMC under standard assumptions on the target distribution and constraints, namely (strong) convexity and log-Sobolev inequalities. To do so, we bring classical optimization arguments for saddle-point algorithms to the geometry of Wasserstein space. We illustrate the relevance and effectiveness of PD-LMC in several applications.
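The primal-dual idea in the abstract can be illustrated on a toy problem. The sketch below is not the authors' PD-LMC as specified in the paper: the abstract does not give the update rule, so the form used here (a Langevin step on the Lagrangian potential f + λg, followed by a projected ascent step on λ) is an assumption, as are the function name `pd_lmc_sketch`, the use of a batch of particles to estimate the constraint, and all parameter values. The target is a standard Gaussian, π ∝ exp(-f) with f(x) = x²/2, and the single constraint is E[g(x)] ≤ 0 with g(x) = b - x, i.e. the mean must be at least b.

```python
import numpy as np

def pd_lmc_sketch(grad_f, g, grad_g, n_particles=2000, n_steps=3000,
                  step=0.01, seed=0):
    """Toy primal-dual Langevin loop for one inequality constraint E[g] <= 0.

    Assumed update form (not taken from the paper):
      primal: Langevin step on the potential f + lam * g
      dual:   projected gradient ascent on lam using the particle mean of g
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)  # initial particles
    lam = 0.0                             # dual variable, kept nonnegative
    for _ in range(n_steps):
        # Primal step: drift along the gradient of the Lagrangian potential.
        drift = grad_f(x) + lam * grad_g(x)
        noise = rng.standard_normal(n_particles)
        x = x - step * drift + np.sqrt(2.0 * step) * noise
        # Dual step: raise lam while the constraint is violated, project to >= 0.
        lam = max(0.0, lam + step * float(np.mean(g(x))))
    return x, lam

# Hypothetical toy instance: standard Gaussian target, mean constrained to be >= b.
b = 1.0
samples, lam = pd_lmc_sketch(
    grad_f=lambda x: x,                  # f(x) = x^2 / 2
    g=lambda x: b - x,                   # E[g] <= 0  <=>  E[x] >= b
    grad_g=lambda x: -np.ones_like(x),
)
```

For this toy instance the KL-closest distribution satisfying the constraint is the shifted Gaussian N(b, 1), with optimal multiplier λ = b, so both the sample mean and `lam` should settle near 1 up to Monte Carlo and discretization error. Averaging g over a batch of particles is a simplification chosen here to keep the dual update low-variance; other estimators of the constraint are possible.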

Paper Structure

This paper contains 21 sections, 10 theorems, 100 equations, 14 figures, 3 tables, 2 algorithms.

Key Result

Proposition 2.2

Under Assumption A:strict_feasibility, the following holds:

Figures (14)

  • Figure 1: Sampling from a 1D truncated Gaussian (ground truth displayed as dashed lines).
  • Figure 2: Sampling from a 2D truncated Gaussian (true mean in red and sample mean in orange).
  • Figure 3: Distribution of the probability of predicting $>\,$50k under the Bayesian logistic model (black lines indicate the mean across genders).
  • Figure 4: Counterfactual sampling of the stock market: dual variables.
  • Figure 5: One-dimensional truncated Gaussian sampling: (a) Ergodic average of the constraint function (slack) and (b) Evolution of the dual variable $\lambda$.
  • ...and 9 more figures

Theorems & Definitions (20)

  • Proposition 2.2
  • proof
  • Theorem 3.3
  • Proposition 3.4
  • Theorem 3.6
  • proof: Proof of Theorem \ref{T:main_convex}
  • Lemma C.1
  • Lemma C.2
  • proof: Proof of Lemma \ref{T:lyapunov}
  • proof: Proof of Lemma \ref{T:R_0}
  • ...and 10 more