Straight-Through meets Sparse Recovery: the Support Exploration Algorithm

Mimoun Mohamed; François Malgouyres; Valentin Emiya; Caroline Chaux

TL;DR

This work repurposes the straight-through estimator (STE) from quantized neural networks to the sparse support recovery problem, formulating a sparsification-based objective $F(H(\mathcal{X}))$ and introducing the Support Exploration Algorithm (SEA). SEA maintains a dense exploration vector $\mathcal{X}$, selects a $k$-sparse support via $S^t=\text{largest}_k(\mathcal{X}^t)$, and updates $\mathcal{X}$ through an STE-inspired gradient, enabling broader exploration of candidate supports than traditional greedy methods. The authors establish RIP-based recovery guarantees (Recovery-RIP and related corollaries) showing that SEA can recover the true support under certain incoherence/noise conditions, and they demonstrate substantial empirical gains in coherent settings (e.g., spike deconvolution) where standard methods falter. The results highlight SEA’s potential as both a standalone sparse-recovery method and a post-processing step to improve existing solvers, with practical implications for real-world inverse problems and potential extensions to neural-network sparsification contexts.
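The loop described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' reference implementation, and it rests on several assumptions not stated in this summary: the sparse iterate is taken to be the least-squares solution restricted to the selected support, the exploration vector is updated with the gradient of $\frac{1}{2}\|Ax-y\|_2^2$ evaluated at that sparse iterate, the exploration vector starts at zero, and the best iterate seen so far is returned. The names `largest_k`, `sea_sketch`, `n_iter` and `eta` are hypothetical.

```python
import numpy as np


def largest_k(v, k):
    """Indices of the k largest-magnitude entries of v (ties broken arbitrarily)."""
    return np.sort(np.argpartition(np.abs(v), -k)[-k:])


def sea_sketch(A, y, k, n_iter=200, eta=1.0):
    """Hedged sketch of a support-exploration loop (assumptions in the lead-in)."""
    n = A.shape[1]
    X = np.zeros(n)                      # dense exploration vector (assumed init)
    best_x, best_loss = np.zeros(n), np.inf
    for _ in range(n_iter):
        S = largest_k(X, k)              # S^t = largest_k(X^t)
        x = np.zeros(n)
        # Assumed sparse iterate: least squares restricted to the support S.
        x[S] = np.linalg.lstsq(A[:, S], y, rcond=None)[0]
        r = A @ x - y
        loss = 0.5 * float(r @ r)
        if loss < best_loss:             # keep the best support explored so far
            best_x, best_loss = x.copy(), loss
        # STE-style step: the gradient is computed at the sparse x,
        # but it is applied to the dense exploration vector X.
        X -= eta * (A.T @ r)
    return best_x, best_loss
```

Because the support is re-selected from the dense vector at every iteration, an index dropped early can re-enter later, which is the exploration behaviour the summary contrasts with greedy methods.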

Abstract

The straight-through estimator (STE) is commonly used to optimize quantized neural networks, yet the settings in which it performs well remain unclear despite its empirical successes. To take a step toward understanding them, we apply the STE to a well-understood problem: sparse support recovery. We introduce the Support Exploration Algorithm (SEA), a novel sparsity-promoting algorithm, and we analyze its performance in support recovery (a.k.a. model selection) problems. SEA explores more supports than state-of-the-art methods, leading to superior performance in experiments, especially when the columns of $A$ are strongly coherent. The theoretical analysis considers recovery guarantees when the linear measurement matrix $A$ satisfies the Restricted Isometry Property (RIP). The sufficient conditions for recovery are comparable to, but more stringent than, those of the state of the art in sparse support recovery. Their significance lies mainly in their applicability to an instance of the STE.

Paper Structure (59 sections, 14 theorems, 103 equations, 36 figures, 6 algorithms)

Introduction
Straight-through estimator.
Sparse support recovery.
Proposed STE-based approach for sparse recovery.
Contributions.
Organization of the article.
Related Works
On the STE.
On sparse prior and support recovery.
Support recovery models and algorithms.
Position of the article.
Method
Notations
The Support Exploration Algorithm
Computational Complexity
...and 44 more sections

Key Result

Theorem 4.1

Assume $A$ satisfies the $(2k+1)$-RIP and […]. If moreover $x^*$ is such that […] and SEA performs more than $T_{RIP}$ iterations, then $S^*$ is contained in […].

Footnote on the normalization of $A$: The normalization aims at simplifying formulas by guaranteeing that $\delta_1=0$. It comes at no expense since, if $A$ is not normalized but satisfies the RIP for $l>1$, its normalization only has a small impact on $\delta_l$. […]
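For reference, the RIP is commonly stated as below. This is the standard textbook form, given here as an assumption; the paper's own statement may differ in notation or scaling. It is consistent with the remark that normalized columns give $\delta_1=0$.

```latex
% A satisfies the l-RIP with constant \delta_l \in [0,1) if,
% for every l-sparse vector x,
\[
(1-\delta_l)\,\|x\|_2^2 \;\le\; \|Ax\|_2^2 \;\le\; (1+\delta_l)\,\|x\|_2^2 .
\]
% With unit-norm columns, \|A e_i\|_2 = 1 for every canonical basis vector e_i,
% so the inequality holds for 1-sparse vectors with \delta_1 = 0.
```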

Figures (36)

Figure 1: Overview of the main results. Left: phase transition diagram showing the recovery limits in dimension $n=500$ as the sparsity $k$ and the number of observations $m$ vary (the higher, the better; see the phase transition section for details). Right: spike deconvolution in dimension $m=n=500$, showing the average distance between the support of the solution $x^*$ and the supports of the estimates obtained from various algorithms, plotted against the sparsity level $k$ (the lower, the better; see the spike deconvolution section for details).
Figure 2: Phase transition diagram: each curve is the threshold below which the related algorithm recovers at least $95\%$ of the supports. $\zeta$ denotes the ratio between the number of rows and the number of columns of $A$, while $\rho$ denotes the ratio between the sparsity and the number of rows of $A$. The matrix $A$ has i.i.d. standard Gaussian entries and the non-zero entries of $x^*$ are drawn uniformly in $[-2, -1]\cup[1, 2]$. $n=500$ is fixed and results are obtained from $1000$ runs.
Figure 3: Spike deconvolution: representation of an instance of $x^*$ and $y$ with the solutions provided by the algorithms when $k = 20$. This is a cropped version of a crowded area (spikes are close).
Figure 4: Spike deconvolution: average support distance between $S^*$ and the support of the solutions provided by several algorithms as a function of the sparsity level $k$.
Figure 5: Visual representation of the main sets of indices encountered in the article.
...and 31 more figures
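The synthetic setup described in the caption of Figure 2 can be sketched as follows. This is only an illustration under stated assumptions: the support is drawn uniformly at random, the observations are noiseless ($y = Ax^*$), and $m$ and $k$ are obtained by rounding $\zeta n$ and $\rho m$. The function name `make_problem` and its `seed` parameter are hypothetical.

```python
import numpy as np


def make_problem(n=500, zeta=0.5, rho=0.1, seed=0):
    """Draw one instance with zeta = m / n and rho = k / m, as in the caption."""
    rng = np.random.default_rng(seed)
    m = max(1, int(round(zeta * n)))     # number of rows of A
    k = max(1, int(round(rho * m)))      # sparsity level
    A = rng.standard_normal((m, n))      # i.i.d. standard Gaussian entries
    # Assumption: the support is chosen uniformly at random.
    support = np.sort(rng.choice(n, size=k, replace=False))
    # Uniform on [-2, -1] U [1, 2]: a random sign times a uniform draw in [1, 2].
    magnitudes = rng.uniform(1.0, 2.0, size=k)
    signs = rng.choice([-1.0, 1.0], size=k)
    x_true = np.zeros(n)
    x_true[support] = signs * magnitudes
    y = A @ x_true                       # assumption: noiseless observations
    return A, x_true, y, support
```

Repeating such draws over a grid of $(\zeta, \rho)$ values and recording how often a solver returns the true support is one way to trace a recovery threshold of the kind plotted in Figure 2.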

Theorems & Definitions (23)

Theorem 4.1: Recovery - RIP case
Corollary 4.2: Noiseless recovery - simplified RIP case
Proposition A.1: Optimization problem equivalence
proof
Theorem C.1: Recovery - Oracle Update Rule
Lemma C.2
proof
Lemma C.3
proof
Theorem C.4: Recovery - General case
...and 13 more
