Multi-layer State Evolution Under Random Convolutional Design

Mara Daniels, Cédric Gerbelot, Florent Krzakala, Lenka Zdeborová

TL;DR

The paper addresses signal recovery under multi-layer generative priors with convolutional layers by analyzing multi-layer approximate message passing (ML-AMP) under random multi-channel convolutional (MCC) designs. It shows a universality result: the state evolution (SE) describing ML-AMP with MCC weights matches the SE for dense Gaussian weights up to a rescaling. The proof uses a permutation-based embedding of MCC matrices into block-Gaussian structures together with spatially coupled SE techniques. This yields precise performance predictions and justifies using structured, efficient MCC matrices in place of fully dense ones, with empirical validation on sparse and multi-layer priors. The findings support scalable, theoretically grounded inference with convolutional priors and are relevant to computational imaging and neural-prior-based recovery.

Abstract

Signal recovery under generative neural network priors has emerged as a promising direction in statistical inference and computational imaging. Theoretical analysis of reconstruction algorithms under generative priors is, however, challenging. For generative priors with fully connected layers and Gaussian i.i.d. weights, this was achieved by the multi-layer approximate message passing (ML-AMP) algorithm via a rigorous state evolution. However, practical generative priors are typically convolutional, allowing for computational benefits and inductive biases, and so the Gaussian i.i.d. weight assumption is very limiting. In this paper, we overcome this limitation and establish the state evolution of ML-AMP for random convolutional layers. We prove in particular that random convolutional layers belong to the same universality class as Gaussian matrices. Our proof technique is of independent interest as it establishes a mapping between convolutional matrices and spatially coupled sensing matrices used in coding theory.

Paper Structure

This paper contains 21 sections, 5 theorems, 58 equations, 7 figures, 1 algorithm.

Key Result

Theorem 4.2

Under the set of assumptions (A1)-(A4), for any sequences of uniformly pseudo-Lipschitz functions $\psi^{N}_{1},\psi^{N}_{2}$ of order $k$, for any $1 \leqslant l \leqslant L$ and any $t \in \mathbb{N}$, the following holds where $Z^{l}(t) \sim \mathcal{N}(0,\kappa^{l}(t))$, $\hat{Z}^{l}(t) \sim \mathcal{N}(0,\hat{\kappa}^{l}(t))$ are independent random variables.

Figures (7)

  • Figure 1: Agreement between the performance of the AMP algorithm run with random multichannel convolutional matrices and its state evolution as proven in this paper. (left) Compressive sensing $y_0 = W x_0 + \zeta$ for noise $\zeta_i \sim \mathcal{N}(0, 10^{-4})$ and signal prior $x_0 \sim \rho \mathcal{N}(0, 1) + (1-\rho) \delta(x)$, where $W \in \mathbb{R}^{Dq \times Pq}$ has varying aspect ratio $\beta = D / P$. Crosses correspond to AMP evaluations for $W \sim \text{MCC}(D, P, q, k)$ according to Definition 3.2, averaged over 10 independent trials. Dots correspond to AMP evaluations for $W \in \mathbb{R}^{D \times P}$ with i.i.d. Gaussian entries $W_{ij} \sim \mathcal{N}(0, 1/P)$. Lines show the state evolution predictions when $W_{ij} \sim \mathcal{N}(0, 1/(Pq))$. The system size is $P = 1024$, $q = 1024$, $k = 3$, where $\beta$ and $D = \beta P$ vary. While our theorem treats the limit $P, D \to \infty$ with $q, k = O(1)$, we observe strong empirical agreement even when $q \sim P$. The appendix gives the same figure for $q = 10 \ll P$. (right) AMP iterates at $\rho = 0.25$ and $\beta$ near the recovery transition. Rather than showing these models have equivalent fixed points, we show a stronger result: the state evolution equations are equivalent at each iteration.
  • Figure 2: MCC matrices operate on $Pq$-dimensional input data, composed of a $q$-dimensional signal for each of $P$ separate channels. The $i$-th output channel is a linear combination of convolutional features extracted from the input channels, $y^{(i)} = \sum_{j=1}^{P} C_{ij} x^{(j)}$, where each $C_{ij}$ is a convolution with filter size $k$. Blue boxes show linear dependencies between signal coordinates.
  • Figure 3: System sizes for convolutional layers in a DCGAN architecture used to generate LSUN images (Radford et al., 2015). These are not directly comparable to MCC matrices, as DCGAN uses fractionally strided convolutions, which can be thought of as a composition of an MCC matrix with superresolution. However, they give a reasonable picture of the sizes of typical layers in convolutional neural networks.
  • Figure 4: A sketch of the permutation lemma applied to matrix $W \sim \text{MCC}(4, 3, 3, 2)$. Left: $W$ before permutation. Right: after permutation, $U W \tilde{U}^T$.
  • Figure 5: ML-AMP compressive sensing recovery under multichannel convolutional designs (crosses) and the state evolution for the corresponding fully connected model (lines). For comparison, we also plot the corresponding fully connected AMP iterations (dots), in which $W^{(l)} \in \mathbb{R}^{D_l \times P_l}$ with $W_{ij} \sim \mathcal{N}(0, 1/P_l)$, with the dimensions of the prior and output channel adjusted appropriately. Left: For $2 \leq l \leq L$, the channel functions are $\varphi^{(l)}(z; \zeta) = z + \zeta$ where $\zeta_i \sim \mathcal{N}(0, \sigma^2)$. Right: For $2 \leq l \leq L$, the channel functions are $\varphi^{(l)}(z; \zeta) = \max(z, 0)$ where the maximum is applied coordinatewise. This channel function is the popular ReLU activation function used by generative convolutional neural networks such as those in (Radford et al., 2015; Bora et al., 2017).
  • ...and 2 more figures
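
To make the state evolution curves of Figure 1 more concrete, here is a minimal sketch of a scalar state evolution recursion for the single-layer compressive sensing setting $y_0 = W x_0 + \zeta$ with the Bernoulli-Gaussian prior. Several things are assumptions of this sketch rather than statements from the paper: a Bayes-optimal (posterior-mean) denoiser, a dense Gaussian design whose entry variance is one over the number of columns (which gives the recursion $\tau_t^2 = (\sigma^2 + \mathrm{mse}_t)/\beta$), a Monte Carlo estimate of the denoiser error, and the function names `bg_denoiser_mse` and `state_evolution`.

```python
import numpy as np


def bg_denoiser_mse(tau2, rho, rng, n_samples=200_000):
    """Monte Carlo estimate of the posterior-mean error for the scalar channel
    r = x + sqrt(tau2) * z, with x ~ rho * N(0, 1) + (1 - rho) * delta_0."""
    active = rng.random(n_samples) < rho
    x = rng.standard_normal(n_samples) * active
    r = x + np.sqrt(tau2) * rng.standard_normal(n_samples)
    # Log-odds that x is nonzero given r: compares N(r; 0, 1 + tau2) to N(r; 0, tau2).
    logit = (
        np.log(rho / (1.0 - rho))
        + 0.5 * np.log(tau2 / (1.0 + tau2))
        + 0.5 * r**2 * (1.0 / tau2 - 1.0 / (1.0 + tau2))
    )
    p_active = 1.0 / (1.0 + np.exp(-logit))
    # Posterior mean: P(active | r) times the Gaussian shrinkage r / (1 + tau2).
    x_hat = p_active * r / (1.0 + tau2)
    return float(np.mean((x_hat - x) ** 2))


def state_evolution(beta, rho, sigma2, n_iter=30, seed=0):
    """Iterate tau2 = (sigma2 + mse) / beta (assumed normalization, see lead-in)
    and return the list of predicted mean squared errors, one per iteration."""
    rng = np.random.default_rng(seed)
    mse = rho  # error of the all-zero initial estimate, E[x^2] = rho
    history = [mse]
    for _ in range(n_iter):
        tau2 = (sigma2 + mse) / beta
        mse = bg_denoiser_mse(tau2, rho, rng)
        history.append(mse)
    return history


if __name__ == "__main__":
    # Noise level and sparsity follow the Figure 1 caption; the beta values are illustrative.
    for beta in (0.1, 2.0):
        final_mse = state_evolution(beta=beta, rho=0.25, sigma2=1e-4)[-1]
        print(f"beta = {beta}: predicted MSE after 30 iterations = {final_mse:.2e}")
```

Under these assumptions the recursion tracks a single scalar per iteration, which is what makes an iteration-by-iteration comparison with AMP runs (as in the right panel of Figure 1) possible. A measurement ratio $\beta$ well below $\rho$ leaves the predicted error large, while a large $\beta$ drives it down toward the noise level.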

Theorems & Definitions (14)

  • Definition 3.1: Gaussian i.i.d. Convolution
  • Definition 3.2: Multi-channel Gaussian i.i.d. Convolution
  • Definition 4.1: State Evolution
  • Theorem 4.2
  • Lemma 4.3: Permutation Lemma
  • proof
  • Definition A.1: pseudo-Lipschitz function
  • Theorem A.2
  • proof
  • Lemma A.3
  • ...and 4 more
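
The MCC construction (Definitions 3.1 and 3.2) and the reordering behind the Permutation Lemma (Lemma 4.3, Figure 4) can be illustrated with a small NumPy sketch. The normalization used here (filter entries drawn from $\mathcal{N}(0, 1/k)$ and blocks scaled by $1/\sqrt{P}$), the specific interleaving permutation, and all function names are assumptions made for illustration; the paper's definitions should be consulted for the exact conventions.

```python
import numpy as np


def gaussian_circulant(q, k, rng):
    """q x q circulant matrix whose first row holds k i.i.d. N(0, 1/k) entries
    followed by zeros (assumed normalization)."""
    first_row = np.zeros(q)
    first_row[:k] = rng.standard_normal(k) / np.sqrt(k)
    # Row a is the first row cyclically shifted by a positions.
    return np.stack([np.roll(first_row, a) for a in range(q)])


def sample_mcc(D, P, q, k, rng):
    """(D*q) x (P*q) block matrix of independent circulant blocks, scaled by 1/sqrt(P)."""
    blocks = [
        [gaussian_circulant(q, k, rng) / np.sqrt(P) for _ in range(P)]
        for _ in range(D)
    ]
    return np.block(blocks)


def interleave_permutation(n_channels, q):
    """Permutation matrix moving index (channel i, position a) from
    i * q + a to a * n_channels + i, i.e. grouping coordinates by position."""
    n = n_channels * q
    perm = np.zeros((n, n))
    for i in range(n_channels):
        for a in range(q):
            perm[a * n_channels + i, i * q + a] = 1.0
    return perm


def block_support(W_perm, D, P, q):
    """q x q boolean grid: True where the corresponding D x P block is nonzero."""
    blocks = W_perm.reshape(q, D, q, P)
    return np.any(blocks != 0, axis=(1, 3))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, P, q, k = 4, 3, 3, 2  # the sizes used in Figure 4
    W = sample_mcc(D, P, q, k, rng)
    W_perm = interleave_permutation(D, q) @ W @ interleave_permutation(P, q).T
    # Each True entry marks a dense D x P Gaussian block; False marks an all-zero block.
    print(block_support(W_perm, D, P, q).astype(int))
```

In this sketch the permuted matrix has a $q \times q$ grid of $D \times P$ blocks, and block $(a, b)$ is dense exactly when $(b - a) \bmod q < k$, which gives the banded, spatially coupled appearance. Blocks along the same wrapped diagonal reuse the same filter coefficients here, so they are not independent of one another.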