Faster Sampling from Log-Concave Densities over Polytopes via Efficient Linear Solvers

Oren Mangoubi, Nisheeth K. Vishnoi

TL;DR

The key technical ingredients are to show that the matrices that arise in this Dikin walk change slowly, to deploy efficient linear solvers that can leverage this slow change to speed up matrix inversion by using information computed in previous steps, and to speed up the computation of the determinantal term in the Metropolis filter step via a randomized Taylor series-based estimator.

Abstract

We consider the problem of sampling from a log-concave distribution $π(θ) \propto e^{-f(θ)}$ constrained to a polytope $K:=\{θ\in \mathbb{R}^d: Aθ\leq b\}$, where $A\in \mathbb{R}^{m\times d}$ and $b \in \mathbb{R}^m$. The fastest-known algorithm \cite{mangoubi2022faster} for the setting when $f$ is $O(1)$-Lipschitz or $O(1)$-smooth runs in roughly $O(md \times md^{ω-1})$ arithmetic operations, where the $md^{ω-1}$ term arises because each Markov chain step requires computing a matrix inversion and determinant (here $ω\approx 2.37$ is the matrix multiplication constant). We present a nearly optimal implementation of this Markov chain whose per-step complexity is roughly the number of non-zero entries of $A$, while the number of Markov chain steps remains the same. The key technical ingredients are 1) to show that the matrices that arise in this Dikin walk change slowly, 2) to deploy efficient linear solvers that can leverage this slow change to speed up matrix inversion by using information computed in previous steps, and 3) to speed up the computation of the determinantal term in the Metropolis filter step via a randomized Taylor series-based estimator.
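To make the third ingredient concrete, the following is a minimal sketch of one standard way to estimate a log-determinant from a truncated Taylor series combined with random probe vectors (a Hutchinson-style trace estimator). It is not necessarily the estimator used in the paper. It assumes the determinantal term can be written as $\det(I + E)$ for a symmetric matrix $E$ with spectral norm below 1, which is plausible when the matrices change slowly between steps; the function name `logdet_taylor_hutchinson` and its parameters are hypothetical.

```python
import numpy as np


def logdet_taylor_hutchinson(E, num_terms=30, num_probes=200, seed=0):
    """Estimate log det(I + E) for symmetric E with spectral norm < 1.

    Illustrative sketch only (not the paper's estimator). Uses
        log det(I + E) = tr(log(I + E)) = sum_{k>=1} (-1)^(k+1) tr(E^k) / k
    and estimates each trace by tr(M) ~ mean of v^T M v over Gaussian v.
    Only matrix-vector products with E are needed, no factorization.
    """
    rng = np.random.default_rng(seed)
    d = E.shape[0]
    total = 0.0
    for _ in range(num_probes):
        v = rng.standard_normal(d)
        w = v.copy()
        estimate = 0.0
        for k in range(1, num_terms + 1):
            w = E @ w  # w now holds E^k v
            estimate += (-1) ** (k + 1) * (v @ w) / k
        total += estimate
    return total / num_probes


if __name__ == "__main__":
    # Small symmetric perturbation with spectral norm 0.1 (assumed setup).
    rng = np.random.default_rng(1)
    B = rng.standard_normal((10, 10))
    S = (B + B.T) / 2
    E = 0.1 * S / np.linalg.norm(S, 2)
    print("estimate:", logdet_taylor_hutchinson(E))
    print("exact:   ", np.linalg.slogdet(np.eye(10) + E)[1])
```

Each probe costs `num_terms` matrix-vector products, so when $E$ is sparse or structured the cost scales with the number of non-zero entries rather than with a dense determinant computation.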

Paper Structure

This paper contains 18 sections, 12 theorems, 25 equations, 2 algorithms.

Key Result

Theorem 2.1

There exists an algorithm (Algorithm alg_Soft_Dikin_Walk_sparse) which, given the following inputs, 1) $\delta, R>0$ and either $L>0$ or $\beta>0$, 2) $A \in \mathbb{R}^{m \times d}$, $b\in \mathbb{R}^m$ that define a polytope $K := \{\theta \in \mathbb{R}^d : A \theta \leq b\}$ such that $K$ is con [...] where $T_f$ is the number of arithmetic operations to evaluate $f$.

Theorems & Definitions (12)

  • Theorem 2.1: Main result
  • Lemma 5.1: Efficient inverse maintenance, Theorem 13 in lee2015efficient
  • Lemma 5.2: Lemmas 1.2 & 1.5 in laddha2020strong
  • Lemma 5.3: Lemma 6.9 in mangoubi2022faster
  • Lemma 5.4: Lemma 6.7 in mangoubi2022faster
  • Lemma 5.5: Lemma 6.15 in mangoubi2022faster
  • Lemma 5.6: von Neumann trace inequality, von1962some
  • Proposition 5.7
  • Lemma 5.8: Hanson-Wright concentration inequality
  • Lemma 6.1
  • ...and 2 more