
A non-asymptotic distributional theory of approximate message passing for sparse and robust regression

Gen Li, Yuting Wei

TL;DR

The paper addresses the challenge of characterizing the finite-sample distributional behavior of high-dimensional estimators in sparse and robust regression. It develops a non-asymptotic AMP framework that decomposes AMP iterates into a Gaussian component plus controllable residuals, enabling faithful Gaussian approximations for a polynomial number of iterations. The results yield non-asymptotic distributional guarantees for both optimally tuned Lasso in sparse regression and robust M-estimators, including explicit bounds on Wasserstein distances and residual norms. This advances uncertainty quantification and inference in high dimensions by providing precise finite-sample characterizations that surpass previous asymptotic SE-based analyses. The framework also highlights a general recipe for analyzing AMP in non-symmetric designs and opens avenues for extensions to broader designs and tighter residual bounds, with strong practical implications for high-dimensional regression problems.

Abstract

Characterizing the distribution of high-dimensional statistical estimators is challenging because classical asymptotic theory breaks down in high dimensions. This paper makes progress toward that goal by developing non-asymptotic distributional characterizations for approximate message passing (AMP) -- a family of iterative algorithms that have proven effective as both fast estimators and powerful theoretical machinery -- for both sparse and robust regression. Prior AMP theory, which focused mostly on high-dimensional asymptotics, fails to describe the behavior of AMP when the number of iterations exceeds $o\big({\log n}/{\log \log n}\big)$ (with $n$ the sample size). We establish the first finite-sample non-asymptotic distributional theory of AMP for both sparse and robust regression that accommodates a polynomial number of iterations. Our results quantify the accuracy of the Gaussian approximation of the AMP iterates, improving upon all prior results and implying enhanced distributional characterizations for both the optimally tuned Lasso and robust M-estimators.

Paper Structure

This paper contains 87 sections, 13 theorems, 365 equations, 2 figures.

Key Result

Theorem 1

Consider the linear model \ref{eqn:linear} under i.i.d. Gaussian design (i.e., $X_{ij} \overset{\mathrm{i.i.d.}}{\sim} \mathcal{N}(0,1/n)$). Suppose the functions $\{G_t\}$ and $\{F_{t}\}$ are differentiable except at a finite number of points. For any $1 \leq t \leq \min\{n,p\}$, the AMP sequence defined ... where ...
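
As a rough illustration of the setting in Theorem 1, the following Python sketch runs a textbook soft-thresholding AMP recursion on a simulated sparse linear model with i.i.d. $\mathcal{N}(0,1/n)$ design. The function names, the threshold rule `alpha * tau`, the problem sizes, and the noise level are all assumptions chosen for illustration; this is not the paper's exact recursion or tuning, and the theorem allows more general functions $\{G_t\}$ and $\{F_t\}$.

```python
import numpy as np


def soft_threshold(v, thresh):
    """Entrywise soft-thresholding, the denoiser commonly paired with the Lasso."""
    return np.sign(v) * np.maximum(np.abs(v) - thresh, 0.0)


def amp_sparse_regression(X, y, num_iters=20, alpha=1.5):
    """Textbook AMP with a soft-thresholding denoiser (illustrative sketch).

    alpha is a hypothetical tuning constant: the threshold at each step is
    alpha times an empirical estimate of the effective noise level.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()
    history = []
    for _ in range(num_iters):
        # Empirical estimate of the effective noise level at this iteration.
        tau = np.linalg.norm(resid) / np.sqrt(n)
        # Pseudo-data: heuristically "truth + approximately Gaussian noise".
        pseudo = beta + X.T @ resid
        beta_new = soft_threshold(pseudo, alpha * tau)
        # Onsager correction: (number of active coordinates / n) * old residual.
        onsager = (np.count_nonzero(beta_new) / n) * resid
        resid = y - X @ beta_new + onsager
        beta = beta_new
        history.append((pseudo, tau))
    return beta, history


def demo(seed=0, n=1000, p=2000, k=50, sigma=0.1):
    """Simulate a k-sparse signal and compare the pseudo-data error with tau."""
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, p))  # X_ij ~ N(0, 1/n)
    beta_star = np.zeros(p)
    beta_star[:k] = rng.choice([-1.0, 1.0], size=k)
    y = X @ beta_star + sigma * rng.normal(size=n)
    beta_hat, history = amp_sparse_regression(X, y)
    pseudo, tau = history[-1]
    # Root-mean-square deviation of the last pseudo-data from the truth.
    rms = np.sqrt(np.mean((pseudo - beta_star) ** 2))
    return beta_star, beta_hat, rms, tau


if __name__ == "__main__":
    beta_star, beta_hat, rms, tau = demo()
    print("relative error:", np.linalg.norm(beta_hat - beta_star) / np.linalg.norm(beta_star))
    print("pseudo-data RMS deviation:", rms, "estimated tau:", tau)
```

If the Gaussian approximation is accurate in this simulation, the RMS deviation of the pseudo-data from the true signal should be close to the estimated noise level `tau`. The sketch only checks this second-moment agreement numerically; it does not reproduce the Wasserstein or residual-norm bounds that the paper establishes.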

Figures (2)

  • Figure 1: Numerical calculations for $H_1(\omega)$ and $H_2(\omega)$ of \ref{eq:defi-H-1} and \ref{eq:defi-H-2} such that $p/k \ge 2.3$.
  • Figure 2: Numerical calculations for $H_1(\tau)$ of \ref{eq:defi-H1} with $\tau \in (0,5)$ and $H_2(\tau)$ of \ref{eq:defi-H2} with $\tau \in (0,3)$.

Theorems & Definitions (26)

  • Theorem 1
  • Remark 1
  • Remark 2
  • Theorem 2
  • Remark 3
  • Remark 4
  • Remark 5
  • Remark 6
  • Theorem 3
  • Theorem 4
  • ...and 16 more