Optimal Estimation in Orthogonally Invariant Generalized Linear Models: Spectral Initialization and Approximate Message Passing

Yihan Zhang; Hong Chang Ji; Ramji Venkataramanan; Marco Mondelli

TL;DR

This work studies parameter estimation in generalized linear models with orthogonally invariant random designs, proposing a spectrally informed initialization and a Generalized Vector AMP (GVAMP) algorithmic framework. It proves that optimal spectral estimators reach a universal weak recovery threshold and that GVAMP dynamics, when initialized spectrally, admit a tractable state evolution describing their asymptotic behavior. A spectrally initialized Bayes-GVAMP variant is introduced whose fixed points align with the Bayes risk conjectured via replica analysis, suggesting statistical optimality within a broad efficiently computable class. Empirical results on synthetic data and real datasets such as GTEx and coded diffraction patterns demonstrate robust universality and superiority over i.i.d.-based methods, underscoring the practical impact of spectral initialization tied to GVAMP in complex correlated designs.

Abstract

We consider the problem of parameter estimation from a generalized linear model with a random design matrix that is orthogonally invariant in law. Such a model allows the design to have an arbitrary distribution of singular values and only assumes that its singular vectors are generic. It is a vast generalization of the i.i.d. Gaussian design typically considered in the theoretical literature, and is motivated by the fact that real data often have a complex correlation structure, so that methods relying on i.i.d. assumptions can be highly suboptimal. Building on the paradigm of spectrally initialized iterative optimization, this paper proposes optimal spectral estimators and combines them with an approximate message passing (AMP) algorithm, establishing rigorous performance guarantees for these two algorithmic steps. Both the spectral initialization and the subsequent AMP meet existing conjectures on the fundamental limits to estimation -- the former on the optimal sample complexity for efficient weak recovery, and the latter on the optimal errors. Numerical experiments suggest the effectiveness of our methods and the accuracy of our theory beyond orthogonally invariant data.
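To make the model concrete, the following is a minimal sketch of how one might sample an orthogonally invariant design and generate phase-retrieval observations from it. The helper names (`haar_orthogonal`, `orthogonally_invariant_design`), the uniform singular-value distribution, and the noiseless output channel y = |Ax| are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix from the Haar measure via QR of a Gaussian matrix."""
    g = rng.standard_normal((n, n))
    q, r = np.linalg.qr(g)
    # Fix the column signs so that the resulting distribution is Haar.
    return q * np.sign(np.diag(r))

def orthogonally_invariant_design(n, d, singular_values, rng):
    """Return A = O diag(s) Q^T with O (n x n) and Q (d x d) Haar orthogonal.

    The singular values are arbitrary; only the singular vectors are 'generic'.
    """
    k = min(n, d)
    s = np.asarray(singular_values, dtype=float)
    assert s.shape == (k,)
    o = haar_orthogonal(n, rng)
    q = haar_orthogonal(d, rng)
    return (o[:, :k] * s) @ q[:, :k].T

rng = np.random.default_rng(0)
n, d = 200, 100                       # aspect ratio delta = n / d = 2
s = rng.uniform(0.5, 1.5, size=d)     # hypothetical spectrum, chosen for illustration
A = orthogonally_invariant_design(n, d, s, rng)
x = rng.standard_normal(d)            # hypothetical i.i.d. Gaussian signal
y = np.abs(A @ x)                     # noiseless phase-retrieval channel (assumed)
```

Swapping in a different vector `s` changes the spectrum of the design while leaving the law of its singular vectors unchanged, which is the sense in which the model generalizes the i.i.d. Gaussian case.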

Paper Structure (88 sections, 45 theorems, 665 equations, 8 figures)

Introduction
Contributions.
Related work
Spectral methods.
Approximate Message Passing.
Bayes risk.
Preliminaries
Notation and definitions
Model
Main results
Optimal spectral estimators
Spectrally initialized GVAMP and its state evolution
State Evolution.
Spectrally initialized Bayes-GVAMP and its state evolution
Conjectured Bayes risk
...and 73 more sections

Key Result

Theorem 4.1

Consider the spectral estimator defined through eqn:spec_est using a preprocessing function subject to asmp:preprocess. Then we have almost surely, where $\lambda^\circ$ is defined in eqn:def_lambda2. Furthermore, if then we have the following: where $\eta$ is defined in eqn:eta.
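As a rough illustration of what a spectral estimator computes, the sketch below takes the top eigenvector of a matrix of the form A^T diag(T(y)) A / n and measures its overlap with the signal. This form is the one commonly used in the spectral-methods literature and is assumed here, since the paper's exact definition (eqn:spec_est) is not reproduced on this page. The trimmed-square preprocessing `trimmed_square` and its cap are hypothetical choices, not the optimal preprocessing of the paper, and an i.i.d. Gaussian design is used as a simple special case of orthogonal invariance.

```python
import numpy as np

def spectral_estimator(A, y, preprocess):
    """Top eigenvector of D = A^T diag(T(y)) A / n (assumed standard form)."""
    n = A.shape[0]
    # Scale the i-th column of A^T by T(y_i), then multiply by A.
    D = (A.T * preprocess(y)) @ A / n
    eigvals, eigvecs = np.linalg.eigh(D)  # eigenvalues in ascending order
    return eigvecs[:, -1]

def overlap(x_hat, x):
    """Normalized absolute correlation between the estimate and the signal."""
    return abs(x_hat @ x) / (np.linalg.norm(x_hat) * np.linalg.norm(x))

def trimmed_square(y, cap=9.0):
    """Hypothetical bounded preprocessing T(y) = min(y^2, cap)."""
    return np.minimum(y ** 2, cap)

rng = np.random.default_rng(0)
d, n = 100, 2000                      # delta = n / d = 20, well above threshold
A = rng.standard_normal((n, d))       # Gaussian design, a special case
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
y = np.abs(A @ x)                     # noiseless phase retrieval (assumed)

x_hat = spectral_estimator(A, y, trimmed_square)
rho = overlap(x_hat, x)
print(f"empirical overlap: {rho:.3f}")
```

The overlap is taken in absolute value because phase retrieval only identifies the signal up to a global sign. The theorem characterizes the limit of this kind of quantity as the dimensions grow; the number printed here is a single finite-size realization.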

Figures (8)

Figure 1: Performance of $5$ algorithms on phase retrieval whose design matrix is given by one of two real datasets from GTEx, see \ref{sec:experiments} for details. The GAMP algorithm Mondelli_Venkataramanan (in yellow), which is provably optimal on Gaussian data, is no longer as effective. The proposed Bayes-GVAMP ('AMP' in red) with spectral initialization ('spec' in black) shows dominant performance which is well predicted by our theory under the orthogonal invariance assumption. Gradient descent ('GD' in blue) and GAMP are both initialized with the spectral estimator from Luo_Alghamdi_Lu ('spec conj' in green). They achieve the same weak recovery threshold as spectrally initialized Bayes-GVAMP but worse overlaps.
Figure 2: Performance of $5$ algorithms on phase retrieval whose design matrix is given by either binary or ternary coded diffraction patterns (CDP), see \ref{sec:experiments} for details. The regression coefficients are obtained by applying standard preprocessing to a $32\times32$ image of a truck from the CIFAR-10 dataset. Our theoretical guarantees ('thy', solid curves) remain accurate even though the design matrix is not orthogonally invariant and the prior is not i.i.d., a strong sign of universality.
Figure 3: The trajectory of spectrally initialized Bayes-GVAMP in \ref{eqn:Bayes_GVAMP} on phase retrieval is accurately tracked by its state evolution. Both trajectories are plotted for $3$ spectral distributions in \ref{eqn:eg_Lambda} at $\delta = 2$.
Figure 4: Plots of the function ${\mathcal{F}}$ and its fixed points, i.e. solutions $v\in[0,\rho]$ to $v = {\mathcal{F}}(v)$, for $3$ spectral distributions in \ref{eqn:eg_Lambda}. From the leftmost to the rightmost panel, $\delta$ is taken to be $1.44$, $1$, $1.34$, respectively.
Figure 5: Asymptotic overlap \ref{eqn:eta} achieved by the spectral estimator in \ref{thm:opt_thr} ('spec' in black), overlap achieved by the state evolution of spectrally initialized Bayes-GVAMP in \ref{eqn:Bayes_GVAMP} run till convergence ('SE' in red), and the conjectured Bayes risk \ref{eqn:replica} expressed in terms of overlap ('replica' in blue). All curves are plotted for $3$ spectral distributions given in \ref{eqn:eg_Lambda}.
...and 3 more figures

Theorems & Definitions (75)

Definition 3.1: Pseudo-Lipschitz functions
Definition 3.2: Polynomial growth
Definition 3.3: Wasserstein convergence, Villani_book
Theorem 4.1: Spectral estimator
Remark 4.1: Universality
Theorem 4.2: Optimal spectral threshold and preprocessing
Remark 4.2: Conjectured computationally optimal weak recovery threshold
Remark 4.3: Effect of non-Gaussian design on spectral threshold
Theorem 4.3: State evolution of spectrally initialized GVAMP
Remark 4.4: Choice of initializer
...and 65 more
