A Convergence Analysis of Approximate Message Passing with Non-Separable Functions and Applications to Multi-Class Classification

Burak Çakmak, Yue M. Lu, Manfred Opper

TL;DR

The paper analyzes convergence of approximate message passing (AMP) with non-separable vector-valued nonlinearities in high-dimensional convex optimization, motivated by multi-class classification. It establishes a contraction-mapping framework via state evolution and an Almeida–Thouless (AT) stability criterion, showing contraction when $\rho_{\rm AT}<1$ and deriving explicit decay rates. The authors apply this to convex losses with a proximal operator, deriving fixed-point equations for the state-evolution statistics and proving the AMP fixed point coincides with the optimizer under suitable smoothness and convexity assumptions; they also quantify asymptotic reconstruction error in terms of fixed-point covariances. Simulation results with cross-entropy loss validate the AT-based predictions and demonstrate scalability via the Householder-dice technique, illustrating convergence behavior as the regularization parameter varies. The work provides a rigorous foundation for using AMP in non-separable, multivariate settings and offers guidance for designing stable AMP-based methods in high-dimensional learning tasks.

Abstract

Motivated by the recent application of approximate message passing (AMP) to the analysis of convex optimization in multi-class classification [Loureiro et al., 2021], we present a convergence analysis of AMP dynamics with non-separable multivariate nonlinearities. As an application, we present a complete (and independent) analysis of the motivating convex optimization problem.

Paper Structure (18 sections, 13 theorems, 112 equations, 1 figure)


Key Result

Proposition 1

Let $\boldsymbol{X}\sim_{\text{i.i.d.}}\mathcal{N}(\boldsymbol{0},\boldsymbol{I}/d)$. Let $f(\gamma;y)$ be differentiable and Lipschitz continuous w.r.t. $\gamma$, and let $f(0;Y)=\mathcal{O}(1)$, where $Y$ is as in (YG0). Define $\boldsymbol{g}_{0}\stackrel{\Delta}{=}$ [...] where the two sequences $\{\tilde{\boldsymbol{\psi}}^{(t)}\}_{t\in[T]}$ and [...]

Figures (1)

  • Figure 1: The convergence of the AMP dynamics with $d=10^{5}$ and $\alpha=2$. The straight lines on the interval $10\leq t\leq 15$ represent $\rho_{\rm AT}^{t}$. The experiments are based on single instances (for each $\lambda_0$) of the AMP dynamics.
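
The comparison in Figure 1 rests on a simple reading of geometric decay: if an error sequence behaves like $C\rho^{t}$, its logarithm is linear in $t$ with slope $\log\rho$, so an empirical rate can be fitted on a late window of iterations and set against a predicted value such as $\rho_{\rm AT}$. The following is a minimal sketch of that fit only. The function name `estimate_decay_rate`, the synthetic sequence, and the value `rho_true = 0.6` are illustrative assumptions and are not taken from the paper; no AMP iteration is run here.

```python
import numpy as np


def estimate_decay_rate(errors, t_start=0):
    """Fit errors[t] ~ C * rho**t by least squares on log(errors); return rho.

    Hypothetical helper for illustration; `errors` must be strictly positive.
    """
    errors = np.asarray(errors, dtype=float)
    t = np.arange(len(errors))[t_start:]
    # A straight-line fit in log scale: log(errors[t]) = log(C) + t * log(rho).
    slope, _intercept = np.polyfit(t, np.log(errors[t_start:]), 1)
    return float(np.exp(slope))


# Synthetic data only: a sequence that decays exactly geometrically.
rho_true = 0.6  # assumed value for the demo, not a result from the paper
synthetic_errors = 2.0 * rho_true ** np.arange(16)

# Fit on the window t >= 10, mirroring the interval used in the figure caption.
rho_hat = estimate_decay_rate(synthetic_errors, t_start=10)
print(f"fitted rate: {rho_hat:.4f}")
```

With real iterates, `errors` would be replaced by a measured quantity such as the norm of the difference between successive iterates, and the fitted rate would only be meaningful on a window where the decay has become approximately geometric.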

Theorems & Definitions (28)

  • Definition 1: State Evolution
  • Proposition 1: Decoupling Principle
  • proof
  • Theorem 1
  • Lemma 1
  • proof
  • Remark 1
  • Definition 2
  • Remark 2
  • proof
  • ...and 18 more