Fast Debiasing of the LASSO Estimator

Shuvayan Banerjee, James Saunderson, Radhendushka Srivastava, Ajit Rajwade

TL;DR

This work tackles bias in the Lasso estimator for high-dimensional sparse regression and the computational bottleneck of debiasing via the approximate inverse $\boldsymbol{M}$. It re-parameterizes the problem in terms of the debiasing weight matrix $\boldsymbol{W} = \boldsymbol{A}\boldsymbol{M}^{\top}$ and derives a simple closed-form, unique solution for $\boldsymbol{W}$ under mild conditions on the optimization parameter $\mu$ and the sensing matrix $\boldsymbol{A}$. By focusing on the product $\boldsymbol{A}\boldsymbol{M}^{\top}$ rather than $\boldsymbol{M}$ itself, the approach preserves the asymptotic debiasing guarantees while eliminating iterative optimization. Empirical results show that the closed-form $\boldsymbol{W}_e$ achieves the same inference performance as the iteratively computed $\boldsymbol{W}_o$ with orders of magnitude less computation, making it attractive for streaming high-dimensional inference. The method is particularly well suited to ensembles of i.i.d. sub-Gaussian rows with diagonal covariance $\boldsymbol{\Sigma}$.

Abstract

In high-dimensional sparse regression, the \textsc{Lasso} estimator offers excellent theoretical guarantees but is well known to produce biased estimates. To address this, \cite{Javanmard2014} introduced a method to ``debias'' the \textsc{Lasso} estimates for a random sub-Gaussian sensing matrix $\boldsymbol{A}$. Their approach relies on computing an ``approximate inverse'' $\boldsymbol{M}$ of the matrix $\boldsymbol{A}^\top \boldsymbol{A}/n$ by solving a convex optimization problem. This matrix $\boldsymbol{M}$ plays a critical role in mitigating bias and in constructing confidence intervals from the debiased \textsc{Lasso} estimates. However, computing $\boldsymbol{M}$ is expensive in practice because it requires iterative optimization. In this work, we re-parameterize the optimization problem to compute a ``debiasing matrix'' $\boldsymbol{W} := \boldsymbol{A}\boldsymbol{M}^{\top}$ directly, rather than the approximate inverse $\boldsymbol{M}$. This reformulation retains the theoretical guarantees of the debiased \textsc{Lasso} estimates, as they depend on the \emph{product} $\boldsymbol{A}\boldsymbol{M}^{\top}$ rather than on $\boldsymbol{M}$ alone. Notably, we provide a simple, computationally efficient, closed-form solution for $\boldsymbol{W}$ under conditions on the sensing matrix $\boldsymbol{A}$ similar to those of the original debiasing formulation, with the additional condition that the entries of every row of $\boldsymbol{A}$ are uncorrelated. Moreover, the optimization problem based on $\boldsymbol{W}$ guarantees a unique optimal solution, unlike the original formulation based on $\boldsymbol{M}$. We verify our main result with numerical simulations.
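To make concrete why only the product $\boldsymbol{A}\boldsymbol{M}^{\top}$ matters, the following is a minimal sketch of the debiasing step written directly in terms of $\boldsymbol{W}$. It assumes the standard debiased estimate $\hat{\boldsymbol{\beta}} + \frac{1}{n}\boldsymbol{M}\boldsymbol{A}^\top(\boldsymbol{y} - \boldsymbol{A}\hat{\boldsymbol{\beta}})$ of the cited approach; since $\boldsymbol{W}^\top = \boldsymbol{M}\boldsymbol{A}^\top$, the matrix $\boldsymbol{M}$ never needs to be formed. The function name `debias_lasso` and the argument names are illustrative, not taken from the paper.

```python
import numpy as np

def debias_lasso(A, y, beta_lasso, W):
    """Debiased estimate written in terms of W = A M^T (illustrative helper).

    Computes beta_lasso + W^T (y - A beta_lasso) / n, which equals
    beta_lasso + M A^T (y - A beta_lasso) / n because W^T = M A^T.
    """
    n = A.shape[0]
    residual = y - A @ beta_lasso      # n-vector of Lasso residuals
    return beta_lasso + W.T @ residual / n
```

Any Lasso solver can supply `beta_lasso`; the sketch only covers the correction step.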

Paper Structure

This paper contains 29 sections, 5 theorems, 33 equations, 1 figure, 1 table, 1 algorithm.

Key Result

Theorem 1

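Let $\boldsymbol{A}$ be an $n\times p$ matrix with no column equal to zero. Define $\rho(\boldsymbol{A}) := \max_{i\neq j} \frac{|\boldsymbol{a}_{.i}^\top \boldsymbol{a}_{.j}|}{\|\boldsymbol{a}_{.j}\|_2^2}$, where $\boldsymbol{a}_{.j}$ denotes the $j$-th column of $\boldsymbol{A}$. The optimal solution of \eqref{eq:opt_W_prim} is given by the closed-form $\boldsymbol{W}_e$ of \eqref{eq:exact} if and only if $\frac{\rho}{1+\rho} \leq \mu \leq 1$, where $\rho = \rho(\boldsymbol{A})$.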
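The threshold in Theorem 1 is cheap to evaluate. The sketch below computes $\rho(\boldsymbol{A})$ and $\frac{\rho}{1+\rho}$ exactly as defined above. The last two functions rest on assumptions that this summary does not state: that the constraint of the optimization problem is $|\boldsymbol{A}^\top\boldsymbol{W}/n - \boldsymbol{I}| \leq \mu$ entrywise (the re-parameterized form of the original debiasing constraint), and that the closed form scales each column as $n(1-\mu)\,\boldsymbol{a}_{.j}/\|\boldsymbol{a}_{.j}\|_2^2$. Both should be checked against the paper's equations; all function names are hypothetical.

```python
import numpy as np

def rho(A):
    """rho(A) = max over i != j of |a_i^T a_j| / ||a_j||_2^2 (a_j = column j)."""
    G = A.T @ A                              # Gram matrix, G[i, j] = a_i^T a_j
    sq_norms = np.diag(G).copy()             # ||a_j||_2^2, nonzero by assumption
    ratios = np.abs(G) / sq_norms[None, :]   # divide column j by ||a_j||_2^2
    np.fill_diagonal(ratios, 0.0)            # exclude the i == j terms
    return ratios.max()

def mu_lower_bound(A):
    """Smallest mu allowed by Theorem 1: rho / (1 + rho)."""
    r = rho(A)
    return r / (1.0 + r)

def closed_form_W(A, mu):
    """ASSUMED closed form: column j is n * (1 - mu) * a_j / ||a_j||_2^2."""
    n = A.shape[0]
    sq_norms = np.sum(A**2, axis=0)
    return n * (1.0 - mu) * A / sq_norms[None, :]

def is_feasible(A, W, mu, tol=1e-10):
    """Check the ASSUMED constraint: every entry of A^T W / n - I is <= mu in magnitude."""
    n, p = A.shape
    return np.max(np.abs(A.T @ W / n - np.eye(p))) <= mu + tol
```

Under these assumptions the diagonal constraint is tight at $\mu$ and the off-diagonal entries are bounded by $(1-\mu)\rho$, which is at most $\mu$ exactly when $\mu \geq \frac{\rho}{1+\rho}$. This is consistent with the lower bound in the theorem, though it only checks feasibility, not optimality.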

Figures (1)

  • Figure 1: Line plot of $\mu$ vs. relative error $\left(\frac{\|\boldsymbol{W}_o-\boldsymbol{W}_e\|_F}{\|\boldsymbol{W}_e\|_F}\right)$ (in log scale) for two $80 \times 100$ sensing matrices: (left) i.i.d. Gaussian and (right) i.i.d. Rademacher. The black vertical line marks the exact value of $\frac{\rho}{1+\rho}$, which is $0.327$ for the Gaussian sensing matrix (left) and $0.298$ for the Rademacher sensing matrix (right). Here, $\boldsymbol{W}_o$ is the solution of the optimization problem in \eqref{eq:opt_W_prim} and $\boldsymbol{W}_e$ is computed as in \eqref{eq:exact}.
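The quantity plotted in Figure 1 can be reproduced in outline as follows. This is a sketch of the setup only: it defines the relative Frobenius error and draws the two kinds of $80 \times 100$ sensing matrices. The seed and the unit scaling of the entries are assumptions, and obtaining $\boldsymbol{W}_o$ requires a convex solver that is not shown here.

```python
import numpy as np

def relative_error(W_o, W_e):
    """Relative Frobenius error ||W_o - W_e||_F / ||W_e||_F."""
    return np.linalg.norm(W_o - W_e, "fro") / np.linalg.norm(W_e, "fro")

rng = np.random.default_rng(0)                 # seed chosen for illustration
n, p = 80, 100                                 # dimensions from the caption
A_gauss = rng.standard_normal((n, p))          # i.i.d. Gaussian entries
A_rad = rng.choice([-1.0, 1.0], size=(n, p))   # i.i.d. Rademacher (+1/-1) entries
```

Plotting `np.log10(relative_error(W_o, W_e))` against a grid of $\mu$ values would give curves of the kind the caption describes.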

Theorems & Definitions (5)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma 4
  • Lemma 5