Optimal Estimation in Orthogonally Invariant Generalized Linear Models: Spectral Initialization and Approximate Message Passing

Yihan Zhang, Hong Chang Ji, Ramji Venkataramanan, Marco Mondelli

TL;DR

This work studies parameter estimation in generalized linear models with orthogonally invariant random designs, proposing a spectrally informed initialization and a Generalized Vector AMP (GVAMP) algorithmic framework. It proves that optimal spectral estimators reach a universal weak recovery threshold and that GVAMP dynamics, when initialized spectrally, admit a tractable state evolution describing their asymptotic behavior. A spectrally initialized Bayes-GVAMP variant is introduced whose fixed points align with the Bayes risk conjectured via replica analysis, suggesting statistical optimality within a broad efficiently computable class. Empirical results on synthetic data and real datasets such as GTEx and coded diffraction patterns demonstrate robust universality and superiority over i.i.d.-based methods, underscoring the practical impact of spectral initialization tied to GVAMP in complex correlated designs.

Abstract

We consider the problem of parameter estimation from a generalized linear model with a random design matrix that is orthogonally invariant in law. Such a model allows the design to have an arbitrary distribution of singular values and only assumes that its singular vectors are generic. It is a vast generalization of the i.i.d. Gaussian design typically considered in the theoretical literature, and is motivated by the fact that real data often have a complex correlation structure, so that methods relying on i.i.d. assumptions can be highly suboptimal. Building on the paradigm of spectrally initialized iterative optimization, this paper proposes optimal spectral estimators and combines them with an approximate message passing (AMP) algorithm, establishing rigorous performance guarantees for these two algorithmic steps. Both the spectral initialization and the subsequent AMP meet existing conjectures on the fundamental limits to estimation: the former on the optimal sample complexity for efficient weak recovery, and the latter on the optimal errors. Numerical experiments suggest the effectiveness of our methods and the accuracy of our theory beyond orthogonally invariant data.
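To make the setup concrete, here is a minimal NumPy sketch of an orthogonally invariant design and a spectral estimator for noiseless phase retrieval. Everything in it is an illustrative assumption rather than the paper's construction: the helper names (`haar_orthonormal`, `spectral_estimate`, `gaussian_design_preprocess`), the singular-value law (uniform on [0.8, 1.2]), the estimator form (top eigenvector of $A^\top \mathrm{diag}(T(y)) A$), and the preprocessing function, which is the one known to be optimal for i.i.d. Gaussian designs and is used here only as a stand-in for the paper's optimal choice.

```python
import numpy as np

def haar_orthonormal(rows, cols, rng):
    """Sample a rows x cols matrix with orthonormal columns (Haar-distributed)."""
    g = rng.standard_normal((rows, cols))
    q, r = np.linalg.qr(g)              # reduced QR: q has shape (rows, cols)
    return q * np.sign(np.diag(r))      # sign fix so the law is exactly Haar

def gaussian_design_preprocess(y, delta):
    """Stand-in preprocessing T(y); optimal for i.i.d. Gaussian designs only."""
    return (y**2 - 1.0) / (y**2 + np.sqrt(delta) - 1.0)

def spectral_estimate(A, y, preprocess):
    """Unit-norm top eigenvector of A^T diag(T(y)) A (assumed estimator form)."""
    D = A.T @ (preprocess(y)[:, None] * A)
    _, eigvecs = np.linalg.eigh(D)      # eigenvalues come back in ascending order
    return eigvecs[:, -1]

def overlap(x_hat, x):
    """Absolute normalized inner product; phase retrieval is sign-ambiguous."""
    return abs(x_hat @ x) / (np.linalg.norm(x_hat) * np.linalg.norm(x))

rng = np.random.default_rng(0)
d, delta = 200, 10                      # dimension and aspect ratio n/d (assumed)
n = d * delta

# Orthogonally invariant design: A = O diag(s) Q^T with Haar O, Q and free spectrum s.
s = rng.uniform(0.8, 1.2, size=d)       # hypothetical singular-value distribution
A = haar_orthonormal(n, d, rng) @ np.diag(s) @ haar_orthonormal(d, d, rng).T

x = rng.standard_normal(d)              # i.i.d. Gaussian signal (assumed prior)
y = np.abs(A @ x)                       # noiseless phase-retrieval observations
y_scaled = y / np.sqrt(np.mean(y**2))   # rescale so the preprocessing sees unit-scale data

x_hat = spectral_estimate(A, y_scaled, lambda v: gaussian_design_preprocess(v, delta))
rho = overlap(x_hat, x)
print(f"overlap between spectral estimate and signal: {rho:.3f}")
```

Replacing `s` with a different singular-value distribution changes the design's correlation structure while keeping its singular vectors generic, which is the setting the abstract describes; a preprocessing function tuned to that spectrum would be needed to match the paper's optimal threshold.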

Paper Structure (88 sections, 45 theorems, 665 equations, 8 figures)

This paper contains 88 sections, 45 theorems, 665 equations, 8 figures.

Key Result

Theorem 4.1

Consider the spectral estimator defined through eqn:spec_est using a preprocessing function subject to asmp:preprocess. Then we have almost surely, where $\lambda^\circ$ is defined in eqn:def_lambda2. Furthermore, if then we have the following: where $\eta$ is defined in eqn:eta.

Figures (8)

  • Figure 1: Performance of $5$ algorithms on phase retrieval whose design matrix is given by one of two real datasets from GTEx; see \ref{sec:experiments} for details. The GAMP algorithm Mondelli_Venkataramanan (in yellow), which is provably optimal on Gaussian data, is no longer as effective. The proposed Bayes-GVAMP ('AMP' in red) with spectral initialization ('spec' in black) shows dominant performance, which is well predicted by our theory under the orthogonal invariance assumption. Gradient descent ('GD' in blue) and GAMP are both initialized with the spectral estimator from Luo_Alghamdi_Lu ('spec conj' in green). They achieve the same weak recovery threshold as spectrally initialized Bayes-GVAMP but worse overlaps.
  • Figure 2: Performance of $5$ algorithms on phase retrieval whose design matrix is given by either binary or ternary coded diffraction patterns (CDP); see \ref{sec:experiments} for details. The regression coefficients are obtained by applying standard preprocessing to a $32\times32$ image of a truck from the CIFAR-10 dataset. Our theoretical guarantees ('thy', solid curves) remain accurate even though the design matrix is not orthogonally invariant and the prior is not i.i.d., a strong sign of universality.
  • Figure 3: The trajectory of spectrally initialized Bayes-GVAMP in \ref{eqn:Bayes_GVAMP} on phase retrieval is accurately tracked by its state evolution. Both trajectories are plotted for $3$ spectral distributions in \ref{eqn:eg_Lambda} at $\delta = 2$.
  • Figure 4: Plots of the function ${\mathcal{F}}$ and its fixed points, i.e. solutions $v\in[0,\rho]$ to $v = {\mathcal{F}}(v)$, for $3$ spectral distributions in \ref{eqn:eg_Lambda}. From the leftmost to the rightmost panel, $\delta$ is taken to be $1.44$, $1$, $1.34$, respectively.
  • Figure 5: Asymptotic overlap \ref{eqn:eta} achieved by the spectral estimator in \ref{thm:opt_thr} ('spec' in black), overlap achieved by the state evolution of spectrally initialized Bayes-GVAMP in \ref{eqn:Bayes_GVAMP} run until convergence ('SE' in red), and the conjectured Bayes risk \ref{eqn:replica} expressed in terms of overlap ('replica' in blue). All curves are plotted for $3$ spectral distributions given in \ref{eqn:eg_Lambda}.
  • ...and 3 more figures

Theorems & Definitions (75)

  • Definition 3.1: Pseudo-Lipschitz functions
  • Definition 3.2: Polynomial growth
  • Definition 3.3: Wasserstein convergence, Villani_book
  • Theorem 4.1: Spectral estimator
  • Remark 4.1: Universality
  • Theorem 4.2: Optimal spectral threshold and preprocessing
  • Remark 4.2: Conjectured computationally optimal weak recovery threshold
  • Remark 4.3: Effect of non-Gaussian design on spectral threshold
  • Theorem 4.3: State evolution of spectrally initialized GVAMP
  • Remark 4.4: Choice of initializer
  • ...and 65 more