List-Decodable Regression via Expander Sketching

Herbod Pourali, Sajjad Hashemian, Ebrahim Ardeshir-Larijani

TL;DR

This paper tackles list-decodable linear regression, where a large fraction of the samples may be adversarially corrupted. It introduces an expander-sketching pipeline that uses lossless expanders to synthesize lightly contaminated buckets, then applies robust aggregation and spectral filtering to recover the regression direction. The method achieves near-optimal rates, with sample complexity $\tilde{O}((d+\log(1/\delta))/\alpha)$, list size $O(1/\alpha)$, and near input-sparsity running time $\tilde{O}(\mathrm{nnz}(X)+d^{3}/\alpha)$, without relying on explicit batch structure or sum-of-squares (SoS) techniques. The analysis ties the isolation properties of expanders to moment concentration and perturbation bounds, and experiments show robustness across varying contamination levels and in a real-data stress test. Overall, the work shows how combinatorial sketching can bypass SQ barriers and enable efficient robust learning in adversarial settings.

Abstract

We introduce an expander-sketching framework for list-decodable linear regression that achieves sample complexity $\tilde{O}((d+\log(1/\delta))/\alpha)$, list size $O(1/\alpha)$, and near input-sparsity running time $\tilde{O}(\mathrm{nnz}(X)+d^{3}/\alpha)$ under standard sub-Gaussian assumptions. Our method uses lossless expanders to synthesize lightly contaminated batches, enabling robust aggregation and a short spectral filtering stage that matches the best known efficient guarantees while avoiding SoS machinery and explicit batch structure.
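The bucketing and aggregation steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a random left-regular bipartite graph as a stand-in for a lossless expander, uses the bucket average of $y_i x_i$ as the sketched moment, and uses a coordinate-wise median as the robust aggregator. The spectral filtering stage is omitted, and all function names and parameters (`assign_buckets`, `bucket_moments`, `coordinatewise_median`, `degree`, `num_buckets`) are hypothetical.

```python
import random
from statistics import median


def assign_buckets(n, num_buckets, degree, seed=0):
    """Map each of n samples to `degree` distinct buckets.

    Assumption: a random left-regular bipartite graph stands in for the
    lossless expander; the paper's actual construction may differ.
    """
    rng = random.Random(seed)
    return [rng.sample(range(num_buckets), degree) for _ in range(n)]


def bucket_moments(X, y, neighbors, num_buckets):
    """Average the vectors y_i * x_i over the samples landing in each bucket."""
    d = len(X[0])
    sums = [[0.0] * d for _ in range(num_buckets)]
    counts = [0] * num_buckets
    for xi, yi, nbrs in zip(X, y, neighbors):
        for b in nbrs:
            counts[b] += 1
            for k in range(d):
                sums[b][k] += yi * xi[k]
    # Skip empty buckets so every returned row is a well-defined average.
    return [[s / c for s in row] for row, c in zip(sums, counts) if c > 0]


def coordinatewise_median(vectors):
    """Aggregate bucket statistics with a per-coordinate median (illustrative choice)."""
    return [median(col) for col in zip(*vectors)]
```

Each sample touches only `degree` buckets, so building the bucket statistics costs time proportional to `degree` times the number of nonzero entries visited, which is the intuition behind a running time that scales with $\mathrm{nnz}(X)$.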

Paper Structure

This paper contains 9 sections, 17 theorems, 96 equations, 8 figures, 8 tables, 1 algorithm.

Key Result

Theorem

Given $n \gtrsim (d + \log(1/\delta))/\alpha$ samples and using expander-batching, one can give a list-decodable regression model that runs in time $\widetilde{O}(\mathrm{nnz}(X) + d^3/\alpha)$ and with probability at least $1-\delta$ outputs a list $L$ of size $O(1/\alpha)$ containing some $\hat{\e

Figures (8)

  • Figure 1: Projection $x_i^\top w^\star$ versus observed response $y_i$ at inlier fraction $\alpha=0.3$. Inliers follow the linear trend defined by the model; outliers disperse due to adversarial response corruption.
  • Figure 2: Test MSE of Expander-L over a fine grid of inlier fractions $\alpha$. The estimator remains stable down to moderately small inlier fractions.
  • Figure 3: Fine-grained dependence of Expander-L on outlier magnitude $S$.
  • Figure 4: Parameter error $\|\hat{w} - w^\star\|_2$ of Expander-L as a function of inlier fraction $\alpha$ under uniform response outliers ($n=5000$, $d=20$, $S=10$). Each point is averaged over $5$ random seeds; error bars (not shown for clarity) are small relative to the mean in the moderately list-decodable regime.
  • Figure 5: Parameter error $\|\hat{w} - w^\star\|_2$ of Expander-L as a function of outlier magnitude $S$ at fixed inlier fraction $\alpha=0.3$ ($n=5000$, $d=20$, uniform response outliers). Each point is averaged over $5$ random seeds, illustrating that the parameter error grows only gradually as the corruption level increases.
  • ...and 3 more figures
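The experimental setup described in the captions above (inlier fraction $\alpha$, outlier magnitude $S$, uniform response outliers, parameter error $\|\hat{w} - w^\star\|_2$) can be mimicked with a small generator. This is a sketch under stated assumptions: the captions do not specify the covariate distribution, the noise level, or the exact outlier law, so standard Gaussian covariates, a unit-norm $w^\star$, inlier noise of standard deviation `noise`, and outlier responses drawn uniformly from $[-S, S]$ are all assumptions. The function names are hypothetical.

```python
import math
import random


def make_contaminated_regression(n=5000, d=20, alpha=0.3, S=10.0, noise=0.1, seed=0):
    """Generate (X, y, w_star, is_inlier) with an alpha fraction of inliers.

    Assumptions (not taken from the source): Gaussian covariates, unit-norm
    w_star, Gaussian inlier noise, outlier responses uniform on [-S, S].
    """
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in w))
    w_star = [v / norm for v in w]

    inlier_idx = set(rng.sample(range(n), round(alpha * n)))
    X, y, is_inlier = [], [], []
    for i in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        if i in inlier_idx:
            # Inliers follow the linear model with small additive noise.
            yi = sum(a * b for a, b in zip(x, w_star)) + rng.gauss(0.0, noise)
        else:
            # Outliers: the response ignores x entirely.
            yi = rng.uniform(-S, S)
        X.append(x)
        y.append(yi)
        is_inlier.append(i in inlier_idx)
    return X, y, w_star, is_inlier


def best_list_error(candidates, w_star):
    """Smallest Euclidean distance from any candidate in the list to w_star."""
    return min(math.dist(w_hat, w_star) for w_hat in candidates)
```

Because a list-decoding algorithm returns several candidates, `best_list_error` scores the list by its closest member, which is one natural way to report a single parameter error per run.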

Theorems & Definitions (33)

  • Theorem
  • Lemma
  • Proof
  • Lemma
  • Proof
  • Lemma
  • Proof
  • Lemma
  • Proof
  • Lemma
  • ...and 23 more