Deep greedy unfolding: Sorting out argsorting in greedy sparse recovery algorithms

Sina Mohammad-Taheri, Matthew J. Colbrook, Simone Brugiapaglia

TL;DR

This work tackles the non-differentiability of the argsort operator in greedy sparse recovery algorithms by introducing differentiable variants, Soft-OMP and Soft-IHT, built on softsort, which enables gradient-based training through algorithm unrolling. The authors prove that Soft-OMP and Soft-IHT approximate their non-differentiable counterparts with an error controlled by a temperature parameter $\tau$, and they develop OMP-Net and IHT-Net to learn structure-aware sparsity patterns. Empirically, Soft-OMP and Soft-IHT approximate the original methods for sufficiently small $\tau$, and the trained greedy networks outperform classical OMP and IHT in heavily undersampled regimes. The framework connects model-based recovery with data-driven learning, offering a path to extracting latent structure and to extending the approach to other greedy algorithms and architectures.

Abstract

Gradient-based learning requires (deep) neural networks to be differentiable at every step. This includes model-based architectures constructed by unrolling the iterations of an iterative algorithm onto the layers of a neural network, a technique known as algorithm unrolling. However, greedy sparse recovery algorithms depend on the non-differentiable argsort operator, which hinders their integration into neural networks. In this paper, we address this challenge for Orthogonal Matching Pursuit (OMP) and Iterative Hard Thresholding (IHT), two popular representative algorithms in this class. We propose permutation-based variants of these algorithms and approximate permutation matrices using "soft" permutation matrices derived from softsort, a continuous relaxation of argsort. We demonstrate, both theoretically and numerically, that Soft-OMP and Soft-IHT, as differentiable counterparts of OMP and IHT that are fully compatible with neural network training, approximate these algorithms with a controllable degree of accuracy. This leads to the development of OMP-Net and IHT-Net, fully trainable network architectures based on Soft-OMP and Soft-IHT, respectively. Finally, by choosing weights as "structure-aware" trainable parameters, we connect our approach to structured sparse recovery and demonstrate its ability to extract latent sparsity patterns from data.
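To make the idea of a "soft" permutation matrix concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common softsort definition in which row $i$ of the relaxed matrix is a softmax over $-|\mathrm{sort}(s)_i - s_j| / \tau$, with sorting in descending order; the paper's exact convention may differ. The names `hard_permutation`, `softsort`, and `max_abs_diff` are hypothetical helpers introduced only for illustration.

```python
import math


def hard_permutation(s):
    """Permutation matrix P such that P applied to s sorts s in descending order.

    This is the non-differentiable argsort step, written as a 0/1 matrix.
    """
    n = len(s)
    order = sorted(range(n), key=lambda i: -s[i])
    return [[1.0 if j == order[i] else 0.0 for j in range(n)] for i in range(n)]


def softsort(s, tau):
    """Assumed softsort relaxation: row i is softmax_j(-|sort(s)_i - s_j| / tau).

    Every row is a probability vector, and every entry depends smoothly on s
    wherever the entries of s are distinct.
    """
    sorted_s = sorted(s, reverse=True)
    rows = []
    for target in sorted_s:
        logits = [-abs(target - sj) / tau for sj in s]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        rows.append([e / z for e in exps])
    return rows


def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


# Illustrative input with distinct entries (chosen arbitrarily).
s = [0.3, 2.0, -1.0, 0.9]
P = hard_permutation(s)
for tau in (1.0, 0.1, 0.01):
    # The gap to the hard permutation matrix shrinks as tau decreases.
    print(tau, max_abs_diff(softsort(s, tau), P))
```

In this sketch the temperature $\tau$ plays the role described in the abstract: a smaller $\tau$ gives a matrix closer to the true permutation, while a larger $\tau$ gives a smoother but less accurate surrogate.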

Paper Structure

This paper contains 29 sections, 8 theorems, 69 equations, 3 figures, 6 algorithms.

Key Result

Proposition 3.2

Softsort satisfies:

Figures (3)

  • Figure 1: First element of the operators sort, softsort, argsort and softargsort applied to the two-dimensional vector $v = (v_1, v_2)$, as defined in \ref{eq:sorting}, \ref{eq:permutation} and \ref{eq:softsort}, shown as a function of $v_1$ for fixed $v_2 = 1$ ($\tau = 0.25$).
  • Figure 1: Relative $\ell^2$-error as a function of $\tau$ (see \ref{subsec:experiment_i}). Recovery accuracy of (Soft-)OMP on the left and (Soft-)IHT on the right for various iteration counts.
  • Figure 2: From top to bottom: MSE loss, oracle weights, learned weights and relative $\ell^2$-error boxplots for (Soft-)OMP in the left column and (Soft-)IHT in the right column (see \ref{subsec:experiment_ii} for further details).

Theorems & Definitions (22)

  • Remark 2.1: Algorithmic differentiability and subgradients
  • Example 3.1
  • Proposition 3.2: Properties of the softsort operator \cite{prillo2020softsort}
  • Proposition 3.3: Lipschitzness of softsort
  • Remark 3.4
  • Remark 3.5
  • Theorem 3.6: Soft-OMP is a good approximation to OMP
  • Proof 1: Proof (sketch)
  • Remark 3.7
  • Remark 3.8
  • ...and 12 more