Sparse Linear Regression and Lattice Problems

Aparna Gupte; Neekon Vafa; Vinod Vaikuntanathan

TL;DR

The paper establishes average-case hardness results for sparse linear regression (SLR) by constructing reductions from lattice-based problems, notably BinaryBDD, to $k$-SLR with Gaussian-like design matrices, thereby linking SLR performance to the conditioning of lattice bases via the restricted eigenvalue parameter. It also shows hardness in Gaussian-design regimes under the CLWE assumption, even when the regression matrix is nearly spherical, by embedding a fixed low-rank component alongside random Gaussian rows. On the algorithmic side, the authors analyze how Lasso behaves on the constructed instances, deriving RE-based lower bounds and predicting error scales that reveal a computational-statistical gap between information-theoretic limits and what efficiently computable estimators can achieve in these hard regimes. The work further discusses open questions about robust average-case hard distributions for SLR, the potential role of lattice basis reduction in improving SLR algorithms, and connections to concurrent hardness results in related cryptographic and statistical problems. Overall, the results provide a foundational hardness picture that complements algorithmic advances and motivates careful design choices in high-dimensional sparse regression, especially when poorly conditioned or lattice-informed covariances arise.

Abstract

Sparse linear regression (SLR) is a well-studied problem in statistics where one is given a design matrix $X\in\mathbb{R}^{m\times n}$ and a response vector $y=X\theta^*+w$ for a $k$-sparse vector $\theta^*$ (that is, $\|\theta^*\|_0\leq k$) and small, arbitrary noise $w$, and the goal is to find a $k$-sparse $\widehat{\theta} \in \mathbb{R}^n$ that minimizes the mean squared prediction error $\frac{1}{m}\|X\widehat{\theta}-X\theta^*\|^2_2$. While $\ell_1$-relaxation methods such as basis pursuit, Lasso, and the Dantzig selector solve SLR when the design matrix is well-conditioned, no general algorithm is known, nor is there any formal evidence of hardness in an average-case setting with respect to all efficient algorithms. We give evidence of average-case hardness of SLR w.r.t. all efficient algorithms assuming the worst-case hardness of lattice problems. Specifically, we give an instance-by-instance reduction from a variant of the bounded distance decoding (BDD) problem on lattices to SLR, where the condition number of the lattice basis that defines the BDD instance is directly related to the restricted eigenvalue condition of the design matrix, which characterizes some of the classical statistical-computational gaps for sparse linear regression. Also, by appealing to worst-case to average-case reductions from the world of lattices, this shows hardness for a distribution of SLR instances; while the design matrices are ill-conditioned, the resulting SLR instances are in the identifiable regime. Furthermore, for well-conditioned (essentially) isotropic Gaussian design matrices, where Lasso is known to behave well in the identifiable regime, we show hardness of outputting any good solution in the unidentifiable regime where there are many solutions, assuming the worst-case hardness of standard and well-studied lattice problems.
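
To make the problem statement concrete, the following is a minimal sketch of an SLR instance and its objective, not code from the paper. It builds a small Gaussian design matrix, a $k$-sparse ground truth, and a noisy response, then evaluates the mean squared prediction error of a candidate estimate. The helper names (`matvec`, `sparsity`, `prediction_error`), the toy sizes, and the noise level are all assumptions chosen for illustration.

```python
import random

def matvec(X, v):
    """Multiply an m x n matrix (given as a list of rows) by a length-n vector."""
    return [sum(x_ij * v_j for x_ij, v_j in zip(row, v)) for row in X]

def sparsity(v):
    """Number of non-zero entries of v, i.e. ||v||_0."""
    return sum(1 for v_j in v if v_j != 0)

def prediction_error(X, theta_hat, theta_star):
    """Mean squared prediction error (1/m) * ||X theta_hat - X theta_star||_2^2."""
    m = len(X)
    diff = [a - b for a, b in zip(matvec(X, theta_hat), matvec(X, theta_star))]
    return sum(d * d for d in diff) / m

# Toy instance (sizes and noise scale are illustrative assumptions).
rng = random.Random(0)
m, n, k = 6, 10, 2
X = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

theta_star = [0.0] * n                  # k-sparse ground truth
theta_star[1], theta_star[7] = 1.0, -2.0

w = [rng.gauss(0.0, 0.01) for _ in range(m)]             # small noise
y = [a + b for a, b in zip(matvec(X, theta_star), w)]    # y = X theta* + w

# A candidate k-sparse estimate with the right support but one perturbed entry.
theta_hat = list(theta_star)
theta_hat[7] = -1.5

print("sparsity of theta_hat:", sparsity(theta_hat))
print("prediction error:", prediction_error(X, theta_hat, theta_star))
```

Note that the objective compares $X\widehat{\theta}$ with $X\theta^*$ rather than $\widehat{\theta}$ with $\theta^*$, so two different sparse vectors can both be good solutions when their images under $X$ are close. This is what the abstract refers to as the unidentifiable regime.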

Paper Structure (28 sections, 23 theorems, 171 equations)

Introduction
Our Results
Binary Bounded Distance Decoding.
Our First Result.
Our Second Result.
Perspectives and Open Problems
Concurrent work.
Road Map of Main Results
Preliminaries
Bounded Distance Decoding
Sparse Linear Regression
Reduction from Bounded Distance Decoding
Interpretations
Proof of Theorem \ref{thm:main-bin-bdd-to-slr}
Performance of Lasso on Our k-SLR Instances
...and 13 more sections

Key Result

Theorem 1

There is a $\mathop{\mathrm{poly}}\nolimits(m, k \cdot 2^{d/k})$-time randomized reduction from $\mathsf{BinaryBDD}$ in $d$ dimensions with parameter $\alpha \leq 1/10$ to $k$-$\mathsf{SLR}$ in dimension $n = k \cdot 2^{d/k}$ and $m \geq 17 d$ samples that succeeds with probability $1 - e^{-\Omega(m
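
As a rough illustration of the parameters in the theorem statement, the sketch below evaluates the SLR dimension n = k * 2^(d/k) and the sample lower bound m >= 17 d for a few choices of sparsity k. The function name `slr_parameters` is hypothetical, and the requirement that k divides d is a simplifying assumption of this sketch, not a condition taken from the theorem.

```python
def slr_parameters(d, k):
    """Return (n, m_min) where n = k * 2^(d/k) and m_min = 17 * d.

    Assumes k divides d so that d/k is an integer; this is a simplification
    made for this sketch only.
    """
    if d % k != 0:
        raise ValueError("this sketch assumes k divides d")
    n = k * 2 ** (d // k)
    m_min = 17 * d
    return n, m_min

# BinaryBDD dimension d = 20 with three different sparsity levels k.
for d, k in [(20, 1), (20, 4), (20, 20)]:
    n, m_min = slr_parameters(d, k)
    print(f"d={d}, k={k}: n={n}, m >= {m_min}")
# d=20, k=1: n=1048576, m >= 340
# d=20, k=4: n=128, m >= 340
# d=20, k=20: n=40, m >= 340
```

The arithmetic shows the trade-off in the statement: for small k the dimension n, and with it the poly(m, k * 2^(d/k)) running time of the reduction, grows exponentially in d, while for k close to d both stay polynomial.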

Theorems & Definitions (51)

Theorem 1
Definition 1
Lemma 1
proof
Definition 2: $\mathsf{BinaryBDD}_{d, \alpha}$
Lemma 2: As in rudelson2010non
Lemma 3: As in laurent2000adaptive, Corollary of Lemma 1
Definition 3: $k$-$\mathsf{SLR}$
Definition 4: As in negahban2012unified
Lemma 4
...and 41 more
