
Linear Operator Approximate Message Passing (OpAMP)

Riccardo Rossetti, Bobak Nazer, Galen Reeves

TL;DR

The paper develops Linear Operator AMP (OpAMP), a rigorous AMP framework for dynamic inference with time-varying linear operators and autoregressive memory. In the high-dimensional limit, the iterates admit a precise Gaussian approximation whose statistics are tracked by state evolution (SE). The paper introduces a decomposition for general linear operators, derives debiasing rules, and analyzes Projection AMP as a special, tractable case with a simplified SE. A case study on power iteration with partial updates in a spiked matrix model shows how partial data access can be analyzed and accelerated, with SE accurately predicting performance under full, round-robin, and random update schemes. Numerical experiments corroborate the theory: partial-update schedules reach the same fixed points as full updates and can require less computational effort, which informs distributed AMP design and scheduling strategies.

Abstract

This paper introduces a framework for approximate message passing (AMP) in dynamic settings where the data at each iteration is passed through a linear operator. This framework is motivated in part by applications in large-scale, distributed computing where only a subset of the data is available at each iteration. An autoregressive memory term is used to mitigate information loss across iterations and a specialized algorithm, called projection AMP, is designed for the case where each linear operator is an orthogonal projection. Precise theoretical guarantees are provided for a class of Gaussian matrices and non-separable denoising functions. Specifically, it is shown that the iterates can be well-approximated in the high-dimensional limit by a Gaussian process whose second-order statistics are defined recursively via state evolution. These results are applied to the problem of estimating a rank-one spike corrupted by additive Gaussian noise using partial row updates, and the theory is validated by numerical simulations.
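The link between "partial row updates" and "orthogonal projection" can be made concrete with a small sketch. It assumes the projection is a coordinate (row-selection) projection, represented here as a 0/1 diagonal matrix. The helper name `row_selection_projection` and the toy sizes are illustrative and not taken from the paper.

```python
import numpy as np

def row_selection_projection(mask):
    """Hypothetical helper: diagonal 0/1 matrix that keeps the rows flagged in `mask`."""
    return np.diag(mask.astype(float))

rng = np.random.default_rng(0)
n = 8
mask = np.zeros(n, dtype=bool)
mask[:3] = True                      # only the first three rows are available

P = row_selection_projection(mask)
Y = rng.standard_normal((n, n))      # stand-in data matrix (assumption)
u = rng.standard_normal(n)           # stand-in input vector (assumption)

# Applying P after the full product ...
full = P @ (Y @ u)

# ... matches computing only the available rows and leaving the rest at zero.
partial = np.zeros(n)
partial[mask] = Y[mask] @ u
```

Under this assumption, `P` is symmetric and idempotent, and applying it costs only the selected fraction of a full matrix-vector product.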

Paper Structure (26 sections, 7 theorems, 131 equations, 5 figures, 1 algorithm)

This paper contains 26 sections, 7 theorems, 131 equations, 5 figures, 1 algorithm.

Key Result

Theorem 1

Let $\{x_t\}$ be generated by eq:opAMP and let $\{y_t\}$ be the zero-mean Gaussian process defined by the SE eq:SEopAMP. Suppose Assumptions ass:ft and ass:Lt hold, $Z \sim \mathsf{GOE}(n)$, and the matrices $\{B_{ts} : 0 \le s < t \}$ are given by … Here, the notation $\mathsf{D}_s$ indicates the Jacobian matrix of $f_t(x_{<t})$ computed with respect to the input vector $x_s$. Then, for any fixed number of …

Figures (5)

  • Figure 1: Block diagram for the basic approximate message passing (AMP) recursion with full memory. The notation $x_{<t} = (x_0,\ldots,x_{t-1})$ compactly refers to the collection of iterates up to time $t-1$.
  • Figure 2: Block diagram for the linear operator approximate message passing (OpAMP) recursion with full memory. The notation $x_{<t} = (x_0,\ldots,x_{t-1})$ compactly refers to the collection of iterates up to time $t-1$.
  • Figure 3: Block diagram for the projection approximate message passing (AMP) recursion with one-step memory.
  • Figure 4: Correlation $\frac{1}{n} \langle \theta, \hat{\theta}_t \rangle$ attained after $t$ iterations for the full-matrix, round-robin, and random-update protocols. The denoising function is simply projection onto the $\sqrt{n}$-sphere, $f_t(x) = \sqrt{n}\, x/\|x\|$, meaning that, modulo the correction term, these protocols are variations on power iteration. The curves represent the theoretical predictions of the state evolution in Theorem \ref{thm:projAMP} and the marks represent the empirical performance averaged over $100$ trials, starting with the same initialization. All three protocols converge to the same fixed point. On the top, the correlation is plotted against the number of iterations, for which the full-matrix updates are, as one would expect, faster. On the bottom, the same data is plotted against the number of effective multiplications by an $n \times n$ matrix, which serves as a proxy for the computational complexity. Under this metric, the round-robin protocol is the most efficient.
  • Figure 5: Correlation $\frac{1}{n} \langle \theta, \hat{\theta}_t \rangle$ attained after $t$ iterations for the full-matrix, round-robin (updating $\frac{n}{10}$ rows per iteration), and random-update (updating each row with probability $\frac{1}{10}$) protocols. The denoising functions $f_t$ are the separable, Bayes-optimal denoisers described in \ref{eq:cond_mean}. All three protocols converge to the same fixed point and, as expected, full-matrix updates require fewer iterations to converge. Markers denote the average value over $100$ trials; standard errors are omitted because they were of negligible size. Solid and dashed curves denote the theoretically predicted asymptotic overlap $\rho_t$ evaluated from \ref{eq:bayesSE} using the state evolution.
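The three update protocols compared in Figures 4 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it omits the debiasing (correction) term, so it is plain power iteration with partial row updates and the state evolution predictions need not apply to it. The spike scaling `(lam / n) * theta theta^T`, the GOE normalization (off-diagonal variance $1/n$), the Rademacher signal, the initialization, and all names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def sphere_projection(x):
    """Denoiser f(x) = sqrt(n) * x / ||x||, i.e. projection onto the sqrt(n)-sphere."""
    return np.sqrt(x.shape[0]) * x / np.linalg.norm(x)

def run_protocol(Y, theta, x0, masks):
    """Power iteration in which only the rows flagged by each mask are refreshed."""
    n = theta.shape[0]
    x = x0.copy()
    corr, cost, total = [], [], 0.0
    for mask in masks:
        u = sphere_projection(x)
        x[mask] = Y[mask] @ u        # rows outside the mask keep their previous value
        total += mask.sum() / n      # fraction of one full n x n multiplication
        corr.append(float(theta @ sphere_projection(x)) / n)
        cost.append(total)
    return np.array(corr), np.array(cost)

rng = np.random.default_rng(0)
n, lam, T, K, p = 500, 4.0, 60, 10, 0.1   # illustrative values (assumptions)

theta = rng.choice([-1.0, 1.0], size=n)   # assumed Rademacher signal, ||theta||^2 = n
G = rng.standard_normal((n, n))
W = (G + G.T) / np.sqrt(2 * n)            # symmetric Gaussian noise (assumed scaling)
Y = (lam / n) * np.outer(theta, theta) + W
x0 = 0.5 * theta + rng.standard_normal(n) # weakly informative initialization (assumption)

# Full-matrix protocol: every row is refreshed at every iteration.
full_masks = [np.ones(n, dtype=bool) for _ in range(T)]

# Round-robin protocol: cycle through K disjoint blocks of n / K rows.
blocks = np.array_split(np.arange(n), K)
rr_masks = []
for t in range(T):
    m = np.zeros(n, dtype=bool)
    m[blocks[t % K]] = True
    rr_masks.append(m)

# Random-update protocol: each row is refreshed independently with probability p.
rand_masks = [rng.random(n) < p for _ in range(T)]

results = {
    "full": run_protocol(Y, theta, x0, full_masks),
    "round_robin": run_protocol(Y, theta, x0, rr_masks),
    "random": run_protocol(Y, theta, x0, rand_masks),
}
```

Each entry of `results` holds the correlation trajectory and the cumulative cost in units of full matrix multiplications, so plotting correlation against either the iteration index or the cost mirrors the two views described in the Figure 4 caption.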

Theorems & Definitions (13)

  • Theorem 1
  • Remark 1
  • Theorem 2
  • Theorem 3
  • Remark 2
  • Theorem 4
  • Remark 3
  • Theorem 5
  • Theorem 6
  • Proof: (i) + (ii) $\implies$ (iii)
  • ...and 3 more