Provable Accuracy Bounds for Hybrid Dynamical Optimization and Sampling

Matthew X. Burns, Qingyuan Hou, Michael C. Huang

TL;DR

This work provides the first non-asymptotic probabilistic guarantees for hybrid large-neighborhood local search (LNLS) on analog dynamical accelerators (DXs) by modeling it as block Langevin diffusion. For ideal DXs, it establishes exponential KL convergence under a log-Sobolev inequality; for devices with finite variation, it quantifies the induced bias in the 2-Wasserstein metric. It introduces Randomized and Cyclic Block Langevin Diffusion with explicit constants, and derives non-asymptotic $W_2$ bounds that connect device variation, step size, and noise to performance. Numerical experiments on Gaussian targets illustrate the theory, showing how hyperparameters and analog non-idealities shape convergence and bias, and revealing equivalent convergence behavior for randomized and cyclic block strategies. The results offer design guidance for hybrid DX systems: principled hyperparameter tuning and a clearer understanding of how analog imperfections affect accuracy in sampling and optimization tasks.

Abstract

Analog dynamical accelerators (DXs) are a growing sub-field in computer architecture research, offering order-of-magnitude gains in power efficiency and latency over traditional digital methods in several machine learning, optimization, and sampling tasks. However, limited-capacity accelerators require hybrid analog/digital algorithms to solve real-world problems, commonly using large-neighborhood local search (LNLS) frameworks. Unlike fully digital algorithms, hybrid LNLS has no non-asymptotic convergence guarantees and no principled hyperparameter selection schemes, particularly limiting cross-device training and inference. In this work, we provide non-asymptotic convergence guarantees for hybrid LNLS by reducing to block Langevin Diffusion (BLD) algorithms. Adapting tools from classical sampling theory, we prove exponential KL-divergence convergence for randomized and cyclic block selection strategies using ideal DXs. With finite device variation, we provide explicit bounds on the 2-Wasserstein bias in terms of step duration, noise strength, and function parameters. Our BLD model provides a key link between established theory and novel computing platforms, and our theoretical results provide a closed-form expression linking device variation, algorithm hyperparameters, and performance.
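As an illustration of the block Langevin idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a quadratic potential $f(x) = \tfrac{1}{2} x^\top A x$ (so the target $\pi_\beta \propto e^{-\beta f}$ is Gaussian), replaces the continuous-time analog evolution with an Euler-Maruyama discretization, and models device variation as a multiplicative Gaussian perturbation of the active block's couplings. The function name `block_langevin` and the parameters `block_time`, `dt`, and `delta_std` are hypothetical names chosen for this sketch.

```python
import numpy as np

def block_langevin(A, x0, n_blocks, block_time, n_cycles, beta=1.0,
                   dt=1e-3, strategy="cyclic", delta_std=0.0, seed=0):
    """Simulate block Langevin dynamics for f(x) = 0.5 * x^T A x.

    Only one block of variables evolves at a time; the others stay frozen,
    mimicking a limited-capacity device working inside an LNLS loop.
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    x = np.array(x0, dtype=float)
    blocks = np.array_split(np.arange(x.size), n_blocks)
    steps = max(1, int(round(block_time / dt)))  # Euler-Maruyama steps per block

    for k in range(n_cycles * n_blocks):
        if strategy == "cyclic":
            idx = blocks[k % n_blocks]            # sweep blocks in fixed order
        elif strategy == "random":
            idx = blocks[rng.integers(n_blocks)]  # pick a block uniformly at random
        else:
            raise ValueError("strategy must be 'cyclic' or 'random'")

        # Hypothetical device model: each programmed coupling is scaled by
        # (1 + delta_std * N(0, 1)), redrawn once per block evolution.
        rows = A[idx]
        rows_dev = rows * (1.0 + delta_std * rng.standard_normal(rows.shape))

        for _ in range(steps):
            grad = rows_dev @ x                   # gradient w.r.t. the active block only
            noise = rng.standard_normal(idx.size)
            x[idx] += -grad * dt + np.sqrt(2.0 * dt / beta) * noise
    return x
```

Setting `delta_std=0.0` corresponds to the ideal-device case, while `strategy` switches between the cyclic and randomized block selection rules. Running many independent chains and comparing their empirical distribution against $\pi_\beta$ is one way to observe the convergence and bias behavior that the theory bounds.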

Paper Structure

This paper contains 31 sections, 18 theorems, 144 equations, 3 figures, and 5 algorithms.

Key Result

Theorem 0

Suppose $\pi_\beta(x) \propto e^{-\beta f(x)}$ satisfies an LSI with constant $1/\gamma$. Then the distribution $\mu_t$ of the Langevin diffusion at time $t$ satisfies
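For orientation, the cited source result (Theorem 1 of vempala_rapid_2019) can be written in that reference's own notation, which is an assumption relative to the $\beta$, $\gamma$ normalization used here. If a target $\nu \propto e^{-f}$ satisfies an LSI with constant $\alpha > 0$, then the law $\rho_t$ of the Langevin dynamics $dX_t = -\nabla f(X_t)\,dt + \sqrt{2}\,dW_t$ satisfies

$$
H_\nu(\rho_t) \le e^{-2\alpha t}\, H_\nu(\rho_0),
$$

where $H_\nu(\rho)$ denotes the KL divergence of $\rho$ with respect to $\nu$. How $\alpha$ maps onto $\beta$ and $1/\gamma$ depends on this paper's conventions and is not restated here.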

Figures (3)

  • Figure 1: Illustration of the LNLS algorithm on a 3-block, 9-variable problem. [Left] An illustration of the variable sample paths during algorithm execution. When a block is not being actively evolved, the constituent variables remain fixed (gray). [Right] Logical partition of variables in an LNLS framework, where one block is being actively evolved by the DX with the others resident in digital memory. The digital host performs the control operations needed to read the block state, write back to memory, and begin the next block evolution.
  • Figure 2: Convergence in $\operatorname{D}_{\mathrm{KL}}$ for varying block counts $b$ for $b$-BCLD and $b$-RCLD (a) versus simulated time and (b) versus cycles $kb$. The inset plot in (a) shows the absolute difference between the RBLD and CBLD $\operatorname{D}_{\mathrm{KL}}$ values averaged over each block count.
  • Figure 3: BCLD $\operatorname{D}_{\mathrm{KL}}$ convergence (a) versus whole-problem cycles with varying block duration $\lambda$ and (b) versus simulated time with varying multiplicative Gaussian perturbations.
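Since the experiments use Gaussian targets, $\operatorname{D}_{\mathrm{KL}}$ curves like those in Figures 2 and 3 can be evaluated in closed form when the sampled distribution is also Gaussian. The sketch below uses the standard formula for the KL divergence between two multivariate Gaussians; the function name `gaussian_kl` is hypothetical, and it is an assumption of this sketch (not stated in the text) that the divergence is estimated this way from a mean and covariance.

```python
import numpy as np

def gaussian_kl(m0, S0, m1, S1):
    """KL( N(m0, S0) || N(m1, S1) ) for full-rank covariance matrices."""
    m0, m1 = np.asarray(m0, dtype=float), np.asarray(m1, dtype=float)
    S0, S1 = np.asarray(S0, dtype=float), np.asarray(S1, dtype=float)
    d = m0.size
    diff = m1 - m0
    trace_term = np.trace(np.linalg.solve(S1, S0))   # tr(S1^{-1} S0)
    maha_term = diff @ np.linalg.solve(S1, diff)     # (m1-m0)^T S1^{-1} (m1-m0)
    _, logdet0 = np.linalg.slogdet(S0)
    _, logdet1 = np.linalg.slogdet(S1)
    return 0.5 * (trace_term + maha_term - d + logdet1 - logdet0)
```

For a quadratic potential $f(x) = \tfrac{1}{2} x^\top A x$, the target $\pi_\beta$ has mean zero and covariance $(\beta A)^{-1}$, so `m1` and `S1` would be set accordingly, with `m0` and `S0` taken from the empirical mean and covariance of the simulated chains.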

Theorems & Definitions (27)

  • Theorem 0: LD Convergence (Theorem 1 of vempala_rapid_2019)
  • Theorem 1: RBLD $\operatorname{D_{\mathrm{KL}}}({\mu_{t_k}}\Vert{\pi_\beta})$ Convergence
  • Lemma 1: Cyclic KL Contraction
  • Theorem 2: CBLD $\operatorname{D_{\mathrm{KL}}}({\mu(t_k)}\Vert{\pi_\beta})$ Convergence
  • Lemma 2: Finite Variation Block Langevin $W_2$ Distance
  • Theorem 3: Finite-Variation BLD $W_2$ Convergence
  • Lemma 3
  • proof
  • Lemma 4: Lemma 16 of chewi_analysis_2021
  • Definition 1: Alternative LSI
  • ...and 17 more