Provable Accuracy Bounds for Hybrid Dynamical Optimization and Sampling

Matthew X. Burns; Qingyuan Hou; Michael C. Huang

TL;DR

This work provides the first non-asymptotic probabilistic guarantees for hybrid large-neighborhood local search (LNLS) on analog dynamical accelerators (DXs) by modeling it as block Langevin diffusion. For ideal DXs, it establishes exponential convergence in KL divergence under a log-Sobolev inequality; for devices with finite variation, it quantifies the induced bias in the 2-Wasserstein ($W_2$) metric. It introduces Randomized and Cyclic Block Langevin Diffusion (RBLD and CBLD) with explicit constants, and derives non-asymptotic $W_2$ bounds that connect device variation, step duration, and noise strength to performance. Numerical experiments on Gaussian targets illustrate the theory, showing how hyperparameters and analog non-idealities shape convergence and bias, and revealing an equivalence in convergence behavior between randomized and cyclic block strategies. The results give design guidance for hybrid DX systems: they support principled hyperparameter tuning and clarify how analog imperfections affect accuracy in sampling and optimization tasks.

Abstract

Analog dynamical accelerators (DXs) are a growing sub-field in computer architecture research, offering order-of-magnitude gains in power efficiency and latency over traditional digital methods in several machine learning, optimization, and sampling tasks. However, limited-capacity accelerators require hybrid analog/digital algorithms to solve real-world problems, commonly using large-neighborhood local search (LNLS) frameworks. Unlike fully digital algorithms, hybrid LNLS has no non-asymptotic convergence guarantees and no principled hyperparameter selection schemes, particularly limiting cross-device training and inference. In this work, we provide non-asymptotic convergence guarantees for hybrid LNLS by reducing to block Langevin Diffusion (BLD) algorithms. Adapting tools from classical sampling theory, we prove exponential KL-divergence convergence for randomized and cyclic block selection strategies using ideal DXs. With finite device variation, we provide explicit bounds on the 2-Wasserstein bias in terms of step duration, noise strength, and function parameters. Our BLD model provides a key link between established theory and novel computing platforms, and our theoretical results provide a closed-form expression linking device variation, algorithm hyperparameters, and performance.
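The block-wise evolution described in the abstract can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes a quadratic potential $f(x) = \tfrac{1}{2} x^\top A x$ (so the target is Gaussian), the scaling $dX_t = -\nabla f(X_t)\,dt + \sqrt{2/\beta}\,dW_t$, an Euler-Maruyama discretization in place of the continuous-time analog evolution, and a multiplicative Gaussian error on the couplings as a stand-in for device variation. The function name `block_langevin` and all of its parameters are hypothetical.

```python
import numpy as np


def block_langevin(A, x0, n_blocks, n_cycles, block_time, dt, beta=1.0,
                   randomized=False, device_sigma=0.0, rng=None):
    """Euler-Maruyama sketch of block Langevin diffusion for f(x) = 0.5 * x^T A x.

    Only the active block evolves; every other coordinate is held fixed,
    mimicking blocks that stay resident in digital memory.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)  # copy, so the caller's x0 is untouched
    blocks = np.array_split(np.arange(x.size), n_blocks)
    steps_per_block = int(round(block_time / dt))

    # Assumed device model (hypothetical): a static multiplicative Gaussian
    # error on every coupling. device_sigma = 0 recovers the ideal device.
    A_dev = A * (1.0 + device_sigma * rng.standard_normal(A.shape))

    for k in range(n_cycles * n_blocks):
        # Cyclic selection sweeps the blocks in order;
        # randomized selection draws one block uniformly at random.
        if randomized:
            idx = blocks[rng.integers(n_blocks)]
        else:
            idx = blocks[k % n_blocks]
        for _ in range(steps_per_block):
            grad = A_dev[idx] @ x  # gradient of f restricted to the active block
            noise = rng.standard_normal(idx.size)
            x[idx] += -grad * dt + np.sqrt(2.0 * dt / beta) * noise
    return x
```

Setting `randomized=True` switches from cyclic to uniformly random block selection, and `block_time` plays the role of the per-block step duration. With `device_sigma=0` and a positive definite `A`, repeated runs draw approximate samples from $N(0, (\beta A)^{-1})$, up to discretization error.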

Paper Structure (31 sections, 18 theorems, 144 equations, 3 figures, 5 algorithms)

Introduction
Background
Related Works
Dynamical Accelerators
Langevin Diffusion
Main Results
LNLS as Block Sampling
Randomized Block Langevin Diffusion
Cyclic Block Langevin Diffusion
Finite Variation
Numerical Experiments
Conclusion
Findings
Limitations
Directions for Future Work
...and 16 more sections

Key Result

Theorem 0

Suppose $\pi_\beta(x) \propto e^{-\beta f(x)}$ satisfies an LSI with constant $1/\gamma$. Then the distribution $\mu_t$ of the Langevin diffusion at time $t$ satisfies
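The displayed bound is not reproduced in this listing. For orientation only, the Langevin-dynamics convergence result from the cited work of Vempala and Wibisono is commonly written as below, in that work's notation rather than this paper's: target $\nu \propto e^{-f}$, diffusion $dX_t = -\nabla f(X_t)\,dt + \sqrt{2}\,dW_t$ with law $\rho_t$, LSI constant $\alpha$, and relative entropy $H_\nu(\rho) = \operatorname{D}_{\mathrm{KL}}(\rho \Vert \nu)$. How $\alpha$ maps onto $\beta$ and $\gamma$ depends on this paper's normalization of the diffusion and the LSI, and is not assumed here.

```latex
% Log-Sobolev inequality with constant \alpha (convention of the cited work)
\[
  H_\nu(\rho) \;\le\; \frac{1}{2\alpha}\, J_\nu(\rho),
  \qquad
  J_\nu(\rho) \;=\; \mathbb{E}_{\rho}\!\left[ \left\| \nabla \log \frac{\rho}{\nu} \right\|^2 \right].
\]
% Resulting exponential decay of relative entropy along the Langevin dynamics
\[
  H_\nu(\rho_t) \;\le\; e^{-2\alpha t}\, H_\nu(\rho_0).
\]
```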

Figures (3)

Figure 1: Illustration of the LNLS algorithm on a 3-block, 9-variable problem. [Left] An illustration of the variable sample paths during algorithm execution. When a block is not being actively evolved, the constituent variables remain fixed (gray). [Right] Logical partition of variables in an LNLS framework, where one block is being actively evolved by the DX with the others resident in digital memory. The digital host performs the control operations needed to read the block state, write back to memory, and begin the next block evolution.
Figure 2: Convergence in $\operatorname{D}_{\mathrm{KL}}$ for varying block counts $b$ for $b$-CBLD and $b$-RBLD (a) versus simulated time and (b) versus cycles $kb$. The inset plot in (a) shows the absolute difference between the RBLD and CBLD $\operatorname{D}_{\mathrm{KL}}$ values averaged over each block count.
Figure 3: CBLD $\operatorname{D}_{\mathrm{KL}}$ convergence (a) versus whole-problem cycles with varying block duration $\lambda$ and (b) versus simulated time with varying multiplicative Gaussian perturbations.
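For Gaussian targets such as those in the experiments, both metrics used in the analysis, $\operatorname{D}_{\mathrm{KL}}$ and $W_2$, have closed forms between Gaussians. The sketch below is a hedged illustration of how such curves could be computed from a mean and covariance; it is not a description of how the figures were actually produced, and the function names `gaussian_kl` and `gaussian_w2` are hypothetical.

```python
import numpy as np


def _sqrtm_psd(M):
    """Symmetric square root of a positive semidefinite matrix via eigh."""
    M = 0.5 * (M + M.T)  # symmetrize to suppress round-off asymmetry
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T


def gaussian_kl(mean_p, cov_p, mean_q, cov_q):
    """KL( N(mean_p, cov_p) || N(mean_q, cov_q) ) in closed form."""
    mean_p = np.atleast_1d(np.asarray(mean_p, dtype=float))
    mean_q = np.atleast_1d(np.asarray(mean_q, dtype=float))
    cov_p = np.atleast_2d(np.asarray(cov_p, dtype=float))
    cov_q = np.atleast_2d(np.asarray(cov_q, dtype=float))
    d = mean_p.size
    cov_q_inv = np.linalg.inv(cov_q)
    diff = mean_q - mean_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(cov_q_inv @ cov_p) + diff @ cov_q_inv @ diff
                  - d + logdet_q - logdet_p)


def gaussian_w2(mean_p, cov_p, mean_q, cov_q):
    """2-Wasserstein distance between two Gaussians in closed form."""
    mean_p = np.atleast_1d(np.asarray(mean_p, dtype=float))
    mean_q = np.atleast_1d(np.asarray(mean_q, dtype=float))
    cov_p = np.atleast_2d(np.asarray(cov_p, dtype=float))
    cov_q = np.atleast_2d(np.asarray(cov_q, dtype=float))
    root_q = _sqrtm_psd(cov_q)
    cross = _sqrtm_psd(root_q @ cov_p @ root_q)
    w2_sq = np.sum((mean_p - mean_q) ** 2) + np.trace(cov_p + cov_q - 2.0 * cross)
    return np.sqrt(max(w2_sq, 0.0))  # guard against tiny negative round-off
```

Given an array `samples` of shape `(n_samples, d)` from a sampler, one could pass `samples.mean(axis=0)` and `np.cov(samples, rowvar=False)` as the first distribution and the target's mean and covariance as the second. This moment-matched estimate is exact only when the sampled law is itself Gaussian.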

Theorems & Definitions (27)

Theorem 0: LD Convergence (Theorem 1 of Vempala and Wibisono, 2019)
Theorem 1: RBLD $\operatorname{D_{\mathrm{KL}}}({\mu_{t_k}}\Vert{\pi_\beta})$ Convergence
Lemma 1: Cyclic KL Contraction
Theorem 2: CBLD $\operatorname{D_{\mathrm{KL}}}({\mu(t_k)}\Vert{\pi_\beta})$ Convergence
Lemma 2: Finite Variation Block Langevin $W_2$ Distance
Theorem 3: Finite-Variation BLD $W_2$ Convergence
Lemma 3
proof
Lemma 4: Lemma 16 of Chewi et al. (2021)
Definition 1: Alternative LSI
...and 17 more
