A non-asymptotic distributional theory of approximate message passing for sparse and robust regression
Gen Li, Yuting Wei
TL;DR
The paper addresses the challenge of characterizing the finite-sample distributional behavior of high-dimensional estimators in sparse and robust regression. It develops a non-asymptotic framework for approximate message passing (AMP) that decomposes each AMP iterate into a Gaussian component plus a residual whose size can be controlled, so that the Gaussian approximation remains faithful for a polynomial number of iterations. The results yield non-asymptotic distributional guarantees for both the optimally tuned Lasso in sparse regression and robust M-estimators, including explicit bounds on Wasserstein distances and residual norms. This advances uncertainty quantification and inference in high dimensions by providing finite-sample characterizations that go beyond earlier asymptotic analyses based on state evolution (SE). The framework also suggests a general recipe for analyzing AMP with non-symmetric designs, and it opens avenues for extensions to broader classes of designs and tighter residual bounds.
Abstract
Characterizing the distribution of high-dimensional statistical estimators is a challenging task, due to the breakdown of classical asymptotic theory in high dimensions. This paper makes progress towards this goal by developing non-asymptotic distributional characterizations for approximate message passing (AMP), a family of iterative algorithms that prove effective as both fast estimators and powerful theoretical machinery, for both sparse and robust regression. Prior AMP theory, which focused for the most part on high-dimensional asymptotics, failed to describe the behavior of AMP once the number of iterations exceeds $o\big({\log n}/{\log \log n}\big)$ (with $n$ the sample size). We establish the first non-asymptotic (finite-sample) distributional theory of AMP for both sparse and robust regression that accommodates a polynomial number of iterations. Our results quantify the accuracy of the Gaussian approximation of the AMP iterates, improving upon all prior results and implying sharper distributional characterizations for both the optimally tuned Lasso and robust M-estimators.
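
To make the object of study concrete, the following is a minimal sketch of a textbook AMP iteration for sparse regression with a soft-thresholding denoiser. It is not taken from the paper and may differ from the paper's exact parametrization. The function names, the threshold multiplier `alpha`, the i.i.d. $N(0, 1/n)$ design, and the problem sizes in `demo` are all illustrative assumptions. The quantity `pseudo` is the pre-thresholding iterate, which state-evolution heuristics predict behaves approximately like the true signal plus Gaussian noise of standard deviation `tau_hat`.

```python
import numpy as np


def soft_threshold(v, tau):
    """Soft-thresholding denoiser: shrink each entry of v toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def amp_sparse_regression(A, y, n_iter=20, alpha=1.5):
    """Textbook AMP for y = A x + noise (illustrative sketch, not the paper's code).

    Assumes A has roughly i.i.d. N(0, 1/n) entries. `alpha` is a hypothetical
    threshold multiplier chosen only for this example.
    """
    n, p = A.shape
    x = np.zeros(p)
    r = y.copy()
    for _ in range(n_iter):
        # Estimate the effective noise level from the current residual.
        tau = np.linalg.norm(r) / np.sqrt(n)
        # Denoise the pseudo-data x + A^T r.
        x_new = soft_threshold(x + A.T @ r, alpha * tau)
        # Onsager correction: (p/n) * mean(eta') * previous residual,
        # where eta' equals 1 exactly on the nonzero entries of x_new.
        onsager = (np.count_nonzero(x_new) / n) * r
        r = y - A @ x_new + onsager
        x = x_new
    # Pseudo-data and noise-level estimate for the final iterate.
    pseudo = x + A.T @ r
    tau_hat = np.linalg.norm(r) / np.sqrt(n)
    return x, pseudo, tau_hat


def demo(seed=0, n=500, p=1000, k=50, sigma=0.1):
    """Run AMP on a synthetic k-sparse problem (all sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n, p)) / np.sqrt(n)
    x_star = np.zeros(p)
    support = rng.choice(p, size=k, replace=False)
    x_star[support] = rng.choice([-1.0, 1.0], size=k)
    y = A @ x_star + sigma * rng.normal(size=n)
    x_hat, pseudo, tau_hat = amp_sparse_regression(A, y)
    return x_star, x_hat, pseudo, tau_hat


if __name__ == "__main__":
    x_star, x_hat, pseudo, tau_hat = demo()
    rel_err = np.linalg.norm(x_hat - x_star) / np.linalg.norm(x_star)
    print("relative error:", rel_err)
    print("empirical std of pseudo - x_star:", np.std(pseudo - x_star))
    print("residual-based estimate tau_hat:", tau_hat)
```

Comparing the empirical spread of `pseudo - x_star` with `tau_hat` gives an informal check of the Gaussian approximation on one simulated instance. The paper's contribution is a finite-sample bound on the error of this kind of approximation, which a simulation of this sort cannot establish.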
