The Lasso error is bounded iff its active set size is bounded away from n in the proportional regime

Pierre C. Bellec

TL;DR

This work analyzes the Lasso in the proportional regime with Gaussian design and well-conditioned covariance, proving a tight equivalence: the L2 risk $\|\hat{b}-b^*\|_2$ is $O_P(1)$ if and only if the active set size $\|\hat{b}\|_0$ stays bounded away from $n$, regardless of any sparsity in $b^*$. The author supplies a phase-transition characterization based on Gaussian width and cone geometry, and offers alternative, more direct proofs of the bounded and unbounded risk regimes via a restricted eigenvalue framework and a Basis Pursuit failure argument, avoiding fixed-point equations. These results reveal a fundamental link between sparsity patterns, degrees of freedom, and generalization in high-dimensional linear models, with a practical consequence: tuning parameters that yield dense solutions can be discarded. The analysis covers non-isotropic designs with spectrum-bounded covariance and clarifies the dense-vs-sparse risk landscape in the proportional setting, connecting to established phase-transition results such as the Donoho–Tanner transition.

Abstract

This note develops an analysis of the Lasso $\hat b$ in linear models without any sparsity or L1 assumption on the true regression vector, in the proportional regime where the dimension $p$ and the sample size $n$ are of the same order. Under Gaussian design and a covariance matrix with spectrum bounded away from $0$ and $+\infty$, it is shown that the L2 risk is stochastically bounded if and only if the number of selected variables is bounded away from $n$, in the sense that $$ (1-\|\hat b\|_0/n)^{-1} = O_P(1) \Longleftrightarrow \|\hat b- b^*\|_2 = O_P(1) $$ as $n,p\to+\infty$. The right-to-left implication rules out constant risk for dense Lasso estimates (estimates with close to $n$ active variables), which can be used to discard tuning parameters leading to dense estimates. We then bring sparsity back into the picture, and revisit the precise phase transition characterizing the sparsity patterns of the true regression vector leading to unbounded Lasso risk -- or, by the above equivalence, to dense Lasso estimates. This precise phase transition was established by \citet{miolane2018distribution,celentano2020lasso} using fixed-point equations in an equivalent sequence model. An alternative proof of this phase transition is provided here using simple arguments, without relying on the fixed-point equations or the equivalent sequence model. A modification of the well-known Restricted Eigenvalue argument makes it possible to extend the analysis to any small tuning parameter of constant order, leading to a bounded risk on one side of the phase transition. On the other side of the phase transition, it is established that the Lasso risk can be unbounded for a given sign pattern as soon as Basis Pursuit fails to recover that sign pattern in noiseless problems.
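The equivalence stated in the abstract can be explored numerically. The sketch below is only an illustration under assumptions that are not taken from the paper: identity covariance, a dense $b^*$ with entries of order $1/\sqrt{p}$, the objective $\|y-Xb\|_2^2/(2n)+\lambda\|b\|_1$, a plain cyclic coordinate descent solver, and the hypothetical helper names `lasso_cd`, `density_diagnostic` and `simulate`. For each tuning parameter it reports the active set size $\|\hat b\|_0$, the quantity $(1-\|\hat b\|_0/n)^{-1}$, and the error $\|\hat b-b^*\|_2$.

```python
import math
import random


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, sweeps=100):
    """Cyclic coordinate descent for ||y - Xb||^2 / (2n) + lam * ||b||_1.

    The normalization of the objective is an assumption of this sketch.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    r = list(y)  # residual y - Xb, kept up to date after each update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes coordinate j.
            rho = sum(X[i][j] * r[i] for i in range(n)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != b[j]:
                d = new - b[j]
                for i in range(n):
                    r[i] -= X[i][j] * d
                b[j] = new
    return b


def density_diagnostic(active, n):
    """(1 - ||b_hat||_0 / n)^{-1}, set to infinity when the active set reaches n."""
    return math.inf if active >= n else 1.0 / (1.0 - active / n)


def simulate(n=30, p=60, lam=0.1, sigma=1.0, seed=0):
    """Draw one Gaussian design with a dense b_star; return (active set size, L2 error)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    b_star = [rng.gauss(0.0, 1.0) / math.sqrt(p) for _ in range(p)]  # no sparsity
    y = [sum(X[i][j] * b_star[j] for j in range(p)) + sigma * rng.gauss(0.0, 1.0)
         for i in range(n)]
    b_hat = lasso_cd(X, y, lam)
    active = sum(1 for v in b_hat if v != 0.0)
    err = math.sqrt(sum((u - v) ** 2 for u, v in zip(b_hat, b_star)))
    return active, err


if __name__ == "__main__":
    n = 30
    for lam in (0.01, 0.05, 0.2, 0.5):
        active, err = simulate(n=n, p=60, lam=lam)
        print(lam, active, density_diagnostic(active, n), round(err, 3))
```

At such small sizes the output is only indicative, since the statement in the abstract is asymptotic in $n$ and $p$. The solver runs a fixed number of sweeps, so for very small $\lambda$ the reported active set is that of an approximate minimizer.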

Paper Structure (6 sections, 5 theorems, 65 equations)

This paper contains 6 sections, 5 theorems, 65 equations.

Key Result

Theorem 1

Let assumption be fulfilled. For any $s_0\in(0,1)$, there exist $r_0'>0$ and an event $\Omega_n(s_0)$ with $\mathbb P(\Omega_n(s_0)) \to 1$ as $n,p\to+\infty$ while $\lambda,\sigma,\gamma,\kappa,s_0$ remain fixed such that, in $\Omega_n(s_0)$,

For any $r_0\in(0,1)$, there exist $s_0'>0$ and an event $\Omega^n(r_0)$ with $\mathbb P(\Omega^n(r_0)) \to 1$ as $n,p\to+\infty$ while $\lambda,\sigma,\

Theorems & Definitions (10)

  • Theorem 1
  • Proof of \ref{theorem_main}, implication \ref{implies_bounded_risk}
  • Lemma 1
  • Proof of \ref{theorem_main}, implication \ref{implies_bounded_sparsity}
  • Theorem 2: miolane2018distribution for $\Sigma=I_p$, celentano2020lasso for $\Sigma\ne I_p$
  • Proposition 1
  • Proof of \ref{prop:bounded_risk}
  • Proposition 2
  • Proof of \ref{prop_if_BP_fails}
  • Proof of \ref{my_lemma}