Iterative Regularization with k-support Norm: An Important Complement to Sparse Recovery

William de Vazelhes, Bhaskar Mukhoty, Xiao-Tong Yuan, Bin Gu

TL;DR

The paper proposes IRKSN, a novel iterative regularization algorithm based on the $k$-support norm regularizer rather than the $\ell_1$ norm, and gives an early stopping bound on its model error that achieves the standard linear rate for sparse recovery.

Abstract

Sparse recovery is ubiquitous in machine learning and signal processing. Due to the NP-hard nature of sparse recovery, existing methods are known to suffer either from restrictive (or even unknown) applicability conditions, or high computational cost. Recently, iterative regularization methods have emerged as a promising fast approach because they can achieve sparse recovery in one pass through early stopping, rather than the tedious grid-search used in the traditional methods. However, most of those iterative methods are based on the $\ell_1$ norm which requires restrictive applicability conditions and could fail in many cases. Therefore, achieving sparse recovery with iterative regularization methods under a wider range of conditions has yet to be further explored. To address this issue, we propose a novel iterative regularization algorithm, IRKSN, based on the $k$-support norm regularizer rather than the $\ell_1$ norm. We provide conditions for sparse recovery with IRKSN, and compare them with traditional conditions for recovery with $\ell_1$ norm regularizers. Additionally, we give an early stopping bound on the model error of IRKSN with explicit constants, achieving the standard linear rate for sparse recovery. Finally, we illustrate the applicability of our algorithm on several experiments, including a support recovery experiment with a correlated design matrix.

Paper Structure

This paper contains 37 sections, 10 theorems, 58 equations, 7 figures, 5 tables, 2 algorithms.

Key Result

Theorem 6

Let $\delta\in\left]0,1\right]$ and let $(\hat{\bm{w}}_t)_{t\in\mathbb{N}}$ be the sequence generated by IRKSN. Assume that the design matrix $\bm{X}$ and the true sparse vector $\bm{w}^*$ satisfy Assumptions ass:sol and ass:ass, and that $\alpha < \frac{\eta}{\|\bm{w}\|_{\infty}}$ with $\eta := \min \dots$ ... In particular (if $\delta > 0$), with $t_{\delta} = \lceil c \delta^{-1/2} \rceil$ for some $c > 0$, ...
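
The only fully legible quantity in the statement above is the stopping time $t_{\delta} = \lceil c \delta^{-1/2} \rceil$. The short Python sketch below illustrates how it scales with $\delta$. The constant $c$ is not specified in the excerpt, so the default `c = 1.0` is purely an assumption for illustration, and the function name `stopping_time` is hypothetical.

```python
import math

def stopping_time(delta, c=1.0):
    """Return ceil(c * delta**(-1/2)); c = 1.0 is an illustrative assumption."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    return math.ceil(c / math.sqrt(delta))

# Dividing delta by 4 doubles the stopping iteration.
for delta in (0.25, 0.0625):
    print(delta, stopping_time(delta))
```

Under this scaling, a smaller $\delta$ calls for more iterations before stopping, growing like $\delta^{-1/2}$.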

Figures (7)

  • Figure 1: Conditions for recovery in various settings: l1SC corresponds to the condition $\max_{\ell \in \bar{S}}|\langle\bm{X}_S^{\dagger} \bm{x}_{\ell}, \operatorname{sgn}(\bm{w}^*_S)\rangle|<1$. "ours" denotes the condition $\max_{i \in \bar{S}}| \langle \bm{X}_S^{\dagger} \bm{x}_{i}, \bm{w}^*_{S}\rangle| < \min_{j \in S}| \langle \bm{X}_S^{\dagger} \bm{x}_{j}, \bm{w}^*_{S}\rangle|$. $c$ denotes some constant in $[0, 1]$. Here $3k$-RIP is shown for indicative purposes, corresponding to the condition for IHT as described in blumensath2009. As we can see, for some cases (in blue), only IRKSN (our algorithm) can provably ensure sparse recovery.
  • Figure 2: $X^{(3)}$, $X^{(4)}$ are correlated with $X^{(0)}, X^{(1)}$, $X^{(2)}$
  • Figure 3: Comparison of the path of IRKSN with Lasso. $w^{(y)}_i$ is the $i$-th component of $\bm{w}^{(y)}$, and $\lambda$ is the penalty of the Lasso. We recall $w^{(y)}_0=w^{(y)}_1=1, w^{(y)}_2=-4, w^{(y)}_3=w^{(y)}_4=0$: only IRKSN recovers the true $\bm{w}^{(y)}$.
  • Figure 4: Error and sparsity vs. number of iterations. Only IRKSN can recover the true $\bm{w}^{(y)}$ in this example.
  • Figure 5: F1-score of support recovery in various settings
  • ...and 2 more figures
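
The two recovery conditions quoted in the Figure 1 caption can be checked numerically for a given design matrix and sparse vector. The NumPy sketch below is one possible way to do so; the function name `recovery_conditions` and the toy matrix are hypothetical and do not come from the paper. It assumes the support $S$ is neither empty nor the full index set.

```python
import numpy as np

def recovery_conditions(X, w_star):
    """Evaluate the two inequalities from the Figure 1 caption.

    Returns (l1sc_holds, ours_holds).
    """
    d = X.shape[1]
    S = np.flatnonzero(w_star)               # support S of w*
    S_bar = np.setdiff1d(np.arange(d), S)    # complement of S
    w_S = w_star[S]
    # Column l of P is X_S^dagger x_l (pseudo-inverse applied to column l of X).
    P = np.linalg.pinv(X[:, S]) @ X
    # l1SC: max over S_bar of |<X_S^dagger x_l, sgn(w*_S)>| < 1
    l1sc_lhs = np.max(np.abs(np.sign(w_S) @ P[:, S_bar]))
    # "ours": max over S_bar of |<X_S^dagger x_i, w*_S>|
    #         < min over S of |<X_S^dagger x_j, w*_S>|
    ours_lhs = np.max(np.abs(w_S @ P[:, S_bar]))
    ours_rhs = np.min(np.abs(w_S @ P[:, S]))
    return bool(l1sc_lhs < 1), bool(ours_lhs < ours_rhs)

# Hypothetical toy design: the third column is correlated with the first two.
X = np.array([[1.0, 0.0, 1.2],
              [0.0, 1.0, -0.1]])
w_star = np.array([1.0, 4.0, 0.0])
l1sc_holds, ours_holds = recovery_conditions(X, w_star)
print(l1sc_holds, ours_holds)
```

In this toy instance the l1SC inequality fails ($|1.2 - 0.1| = 1.1 \geq 1$) while the "ours" inequality holds ($|1.2 - 0.4| = 0.8 < 1$), which is the kind of case the caption highlights in blue.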

Theorems & Definitions (27)

  • Definition 1: argyriou2012sparse, mcdonald2014spectral
  • Definition 2: Proximal operator, parikh2014proximal
  • Theorem 6: Early Stopping Bound
  • proof
  • Definition A.1: Legendre-Fenchel dual, Rockafellar70
  • Definition A.2: Hard-thresholding operator, blumensath2009
  • Remark A.3
  • Example A.4
  • Definition A.5: top-$k$ norm
  • Lemma B.2
  • ...and 17 more
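
The hard-thresholding operator listed as Definition A.2 is not reproduced in this excerpt. As a point of reference, the sketch below implements the standard form of that operator, which keeps the $k$ largest-magnitude entries of a vector and zeroes the rest. Whether the paper's definition matches this in every detail (for example, in how ties are broken) is an assumption, and the function name `hard_threshold` is hypothetical.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and set the others to zero.

    Ties between equal magnitudes are broken arbitrarily by argpartition.
    """
    w = np.asarray(w, dtype=float)
    out = np.zeros_like(w)
    if k <= 0:
        return out
    if k >= w.size:
        return w.copy()
    # Indices of the k entries with largest absolute value (in no particular order).
    keep = np.argpartition(np.abs(w), -k)[-k:]
    out[keep] = w[keep]
    return out

print(hard_threshold([1.0, -4.0, 0.5, 2.0, 0.0], 2))
```

Here the two largest-magnitude entries, $-4$ and $2$, are kept and every other entry is set to zero.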