Faster Sampling from Log-Concave Densities over Polytopes via Efficient Linear Solvers

Oren Mangoubi; Nisheeth K. Vishnoi

TL;DR

The key technical ingredients are to show that the matrices that arise in this Dikin walk change slowly, to deploy efficient linear solvers that can leverage this slow change to speed up matrix inversion by using information computed in previous steps, and to speed up the computation of the determinantal term in the Metropolis filter step via a randomized Taylor series-based estimator.
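To make the last ingredient concrete, the following is a minimal sketch of a randomized Taylor series estimator for a log-determinant. It is a generic Hutchinson-style illustration and not necessarily the estimator used in the paper: the function name `logdet_taylor_estimate`, the parameters `num_terms` and `num_probes`, and the assumption that $E$ is symmetric with spectral norm below 1 are all introduced here for illustration. It relies on the identity $\log\det(I+E) = \sum_{k\geq 1} (-1)^{k+1}\,\mathrm{tr}(E^k)/k$ and replaces each trace by an average of $z^\top E^k z$ over random sign vectors $z$, so only matrix-vector products are needed.

```python
import numpy as np

def logdet_taylor_estimate(E, num_terms=20, num_probes=200, rng=None):
    """Estimate log det(I + E) by a truncated Taylor series with random probes.

    Assumes (for this sketch) that E is square with spectral norm < 1,
    so that the series for log(I + E) converges.
    """
    rng = np.random.default_rng(rng)
    d = E.shape[0]
    # Rademacher probe vectors: E[z^T B z] = tr(B) for any square matrix B.
    Z = rng.choice([-1.0, 1.0], size=(d, num_probes))
    V = Z.copy()
    total = 0.0
    for k in range(1, num_terms + 1):
        V = E @ V  # now V = E^k Z, computed with matrix-vector products only
        trace_k = np.mean(np.sum(Z * V, axis=0))  # Hutchinson estimate of tr(E^k)
        total += (-1) ** (k + 1) * trace_k / k
    return total
```

In a Dikin-walk setting one might take $E$ to measure the relative change between the Hessians at the current and proposed points, which is small when the matrices change slowly; that mapping is an assumption of this sketch rather than a statement of the paper's construction.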

Abstract

We consider the problem of sampling from a log-concave distribution $\pi(\theta) \propto e^{-f(\theta)}$ constrained to a polytope $K:=\{\theta\in \mathbb{R}^d: A\theta\leq b\}$, where $A\in \mathbb{R}^{m\times d}$ and $b \in \mathbb{R}^m$. The fastest-known algorithm \cite{mangoubi2022faster} for the setting when $f$ is $O(1)$-Lipschitz or $O(1)$-smooth runs in roughly $O(md \times md^{\omega-1})$ arithmetic operations, where the $md^{\omega-1}$ term arises because each Markov chain step requires computing a matrix inversion and determinant (here $\omega\approx 2.37$ is the matrix multiplication constant). We present a nearly-optimal implementation of this Markov chain with per-step complexity which is roughly the number of non-zero entries of $A$ while the number of Markov chain steps remains the same. The key technical ingredients are 1) to show that the matrices that arise in this Dikin walk change slowly, 2) to deploy efficient linear solvers that can leverage this slow change to speed up matrix inversion by using information computed in previous steps, and 3) to speed up the computation of the determinantal term in the Metropolis filter step via a randomized Taylor series-based estimator.
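To show where the per-step matrix inversion and determinant come from, here is a minimal dense-linear-algebra sketch of one Dikin walk step with a Metropolis filter. It is a baseline illustration under stated assumptions, not the paper's algorithm: it uses the plain log-barrier Hessian $H(\theta) = A^\top S(\theta)^{-2} A$ with $S(\theta) = \mathrm{diag}(b - A\theta)$ instead of the soft-threshold variant, and the names `log_barrier_hessian`, `dikin_step`, and the step-size parameter `r` are hypothetical.

```python
import numpy as np

def log_barrier_hessian(A, b, theta):
    """Hessian of the log-barrier -sum_i log(b_i - a_i^T theta)."""
    s = b - A @ theta          # slacks, strictly positive inside K
    As = A / s[:, None]        # rows of A scaled by 1 / slack
    return As.T @ As

def dikin_step(A, b, f, theta, r, rng):
    """One Metropolis-filtered Dikin walk step (dense, illustrative only)."""
    d = theta.shape[0]
    H = log_barrier_hessian(A, b, theta)
    L = np.linalg.cholesky(H)  # H = L L^T
    xi = rng.standard_normal(d)
    # Solving L^T u = xi gives u ~ N(0, H^{-1}): this is the "inversion" cost.
    z = theta + r * np.linalg.solve(L.T, xi)
    if np.any(A @ z >= b):
        return theta           # proposal left the polytope: reject
    Hz = log_barrier_hessian(A, b, z)
    diff = z - theta
    # Log proposal densities up to a shared constant; the slogdet calls are
    # the determinantal term of the Metropolis filter.
    log_q_fwd = 0.5 * np.linalg.slogdet(H)[1] - diff @ H @ diff / (2 * r**2)
    log_q_bwd = 0.5 * np.linalg.slogdet(Hz)[1] - diff @ Hz @ diff / (2 * r**2)
    log_accept = (-f(z) + log_q_bwd) - (-f(theta) + log_q_fwd)
    if np.log(rng.uniform()) < log_accept:
        return z
    return theta
```

Each call above refactors and re-evaluates determinants from scratch; the abstract's point is that these two operations dominate the per-step cost and can be replaced by solvers and estimators that reuse work from previous steps.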

Paper Structure (18 sections, 12 theorems, 25 equations, 2 algorithms)


Introduction
Main result
Algorithm
Overview of proof of main result
Dikin walk in the special case $f\equiv 0$.
Dikin walks for sampling from Lipschitz/smooth log-concave distributions.
Faster implementation of matrix operations when $f\equiv 0$.
Our work.
Bounding the change in the soft-threshold log-barrier Hessian.
Computing a randomized estimate for the determinantal term.
Bounding the number of arithmetic operations.
Bounding the total variation distance.
Preliminaries needed for the proof
Proof of the main result
Conclusion
...and 3 more sections

Key Result

Theorem 2.1

There exists an algorithm (Algorithm alg_Soft_Dikin_Walk_sparse) which, given the following inputs, 1) $\delta, R>0$ and either $L>0$ or $\beta>0$, 2) $A \in \mathbb{R}^{m \times d}$, $b\in \mathbb{R}^m$ that define a polytope $K := \{\theta \in \mathbb{R}^d : A \theta \leq b\}$ such that $K$ is con [...] where $T_f$ is the number of arithmetic operations to evaluate $f$.

Theorems & Definitions (12)

Theorem 2.1: Main result
Lemma 5.1: Efficient inverse maintenance, Theorem 13 in lee2015efficient
Lemma 5.2: Lemmas 1.2 & 1.5 in laddha2020strong
Lemma 5.3: Lemma 6.9 in mangoubi2022faster
Lemma 5.4: Lemma 6.7 in mangoubi2022faster
Lemma 5.5: Lemma 6.15 in mangoubi2022faster
Lemma 5.6: von Neumann trace inequality, von1962some
Proposition 5.7
Lemma 5.8: Hanson-Wright concentration inequality
Lemma 6.1
...and 2 more
