Algorithm Unfolding for Block-sparse and MMV Problems with Reduced Training Overhead

Jan Christian Hauffen, Peter Jung, Nicole Mücke

TL;DR

This study considers algorithm unfolding for the multiple measurement vector (MMV) problem in the case where only few training samples are available and proposes a reduced-size network architecture based on the Kronecker structure imposed by the MMV observation model.

Abstract

In this paper we consider algorithm unfolding for the Multiple Measurement Vector (MMV) problem in the case where only few training samples are available. Algorithm unfolding has been shown empirically to speed up, in a data-driven way, the convergence of various classical iterative algorithms, but for supervised learning it is important to achieve this with minimal training data. To this end we consider the learned block iterative shrinkage thresholding algorithm (LBISTA) under different training strategies. To approach almost data-free optimization at minimal training overhead, the number of trainable parameters for algorithm unfolding has to be substantially reduced. We therefore propose a reduced-size network architecture based on the Kronecker structure imposed by the MMV observation model and present the corresponding theory in this context. To ensure proper generalization, we then extend the analytic weight approach by Liu et al. to LBISTA and the MMV setting. Rigorous theoretical guarantees and convergence results are stated for this case. We show that the network weights can be computed by solving an explicit equation at the reduced MMV dimensions, which also admits a closed-form solution. Towards more practical problems, we then consider convolutional observation models and show that the proposed architecture and the analytical weight computation can be further simplified, which opens new directions for convolutional neural networks. Finally, we evaluate the unfolded algorithms in numerical experiments and discuss connections to other sparse recovery algorithms.
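The Kronecker structure mentioned in the abstract can be illustrated with a small worked example. The sketch below assumes the usual MMV model $Y = AX$ with a shared measurement matrix $A \in \mathbb{R}^{m\times n}$ and $d$ measurement vectors, and uses row-wise vectorization, for which $\mathrm{vec}(AX) = (A \otimes I_d)\,\mathrm{vec}(X)$. The names `A`, `X`, `kron`, and `vec_rows` are illustrative and not taken from the paper, and the parameter count at the end is only meant to indicate why exploiting the structure reduces the number of trainable weights; it is not the paper's exact architecture.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matmul(A, B):
    """Plain matrix product A @ B for lists of rows."""
    cols_b = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_b]
            for row in A]


def identity(d):
    """d x d identity matrix."""
    return [[1 if i == j else 0 for j in range(d)] for i in range(d)]


def vec_rows(M):
    """Row-wise vectorization: concatenate the rows of M."""
    return [entry for row in M for entry in row]


# Hypothetical toy sizes: m = 2 measurements, n = 3 unknowns, d = 2 vectors.
A = [[1, 2, 0],
     [0, 1, 3]]
X = [[1, 0],
     [0, 2],
     [4, 5]]
d = len(X[0])

# MMV model in matrix form.
Y = matmul(A, X)

# Equivalent single-vector model with the lifted matrix D = A (x) I_d.
D = kron(A, identity(d))
lifted = [sum(w * x for w, x in zip(row, vec_rows(X))) for row in D]

# An unstructured weight of the same shape as D has m*n*d^2 entries,
# whereas a weight that keeps the Kronecker form only needs m*n.
dense_params = len(D) * len(D[0])
structured_params = len(A) * len(A[0])
```

In the lifted model each row of $X$ becomes one block of length $d$, so joint row sparsity of $X$ corresponds to block sparsity of $\mathrm{vec}(X)$.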

Paper Structure

This paper contains 27 sections, 5 theorems, 110 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

For any $B\in\mathcal{W}_b(D)$ and any sequence $\gamma^{(k)}\in \left(0, \frac{2}{\mu(2 s-1)+1}\right)$ and parameters $\left( B^{(k)}:=B, \alpha^{(k)},\gamma^{(k)}\right)$, for $k\leq K$, with: for some $\kappa\geq1$, with $\mu=d\tilde{\mu}_b(D)$. With $M>0$ and $s<(\mu^{-1}+1)/2$ we have: where

Figures (3)

  • Figure 1: NMSE in dB over layers / iterations, pnz = 10%, without noise. BISTA and FastBISTA are evaluated with $\alpha = 1$ and $\gamma = 1 / (1.01\|D\|^2)$.
  • Figure 2: Plots supporting the results of Theorem \ref{UpperBound}, for the problem with a circular convolution matrix (\ref{fig:justification}) and with a Gaussian measurement matrix (\ref{fig:justification_gam}).
  • Figure 3: Training history for the results shown in Figure \ref{fig:Gaussian_NMSE_vs_Iter} and Figure \ref{fig:Conv_NMSE_vs_Iter}, respectively. ALBISTA needs fewer training iterations than LBISTA CP (untied) or LBISTA (untied).
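
The BISTA baseline in Figure 1 uses $\alpha = 1$ and the step size $\gamma = 1/(1.01\|D\|^2)$. The following is a minimal sketch of such an iteration, not the authors' implementation. It assumes the standard proximal-gradient form for $\tfrac12\|y - Dx\|_2^2 + \alpha\|x\|_{2,1}$ with equal-size consecutive blocks and threshold $\alpha\gamma$; how $\alpha$ and $\gamma$ enter the paper's update is not stated in this summary, so that coupling is an assumption. All function names are hypothetical, and $\|D\|^2$ is estimated by a simple power iteration.

```python
import math


def matvec(M, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def spectral_norm_sq(D, iters=200):
    """Estimate ||D||_2^2 by power iteration on D^T D (rough sketch)."""
    Dt = transpose(D)
    v = [1.0] * len(D[0])
    lam = 0.0
    for _ in range(iters):
        w = matvec(Dt, matvec(D, v))
        lam = math.sqrt(sum(t * t for t in w))
        if lam == 0.0:
            return 0.0
        v = [t / lam for t in w]
    return lam


def block_soft_threshold(x, block_size, tau):
    """Shrink the Euclidean norm of each consecutive block of x by tau."""
    out = []
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        norm = math.sqrt(sum(t * t for t in block))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0.0 else 0.0
        out.extend(scale * t for t in block)
    return out


def bista(D, y, block_size, alpha=1.0, iters=50):
    """Block ISTA with step gamma = 1 / (1.01 ||D||^2); assumed threshold alpha * gamma."""
    gamma = 1.0 / (1.01 * spectral_norm_sq(D))
    Dt = transpose(D)
    x = [0.0] * len(D[0])
    for _ in range(iters):
        residual = [yi - pi for yi, pi in zip(y, matvec(D, x))]
        step = [xi + gamma * gi for xi, gi in zip(x, matvec(Dt, residual))]
        x = block_soft_threshold(step, block_size, alpha * gamma)
    return x


# Toy usage with a hypothetical identity dictionary and two blocks of size 2.
D_toy = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
x_hat = bista(D_toy, [3.0, 4.0, 0.0, 0.0], block_size=2)
```

The learned variants discussed in the paper replace the fixed matrix and the scalar parameters of such an iteration with per-layer trainable or analytically computed quantities; the sketch only covers the classical, untrained baseline.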

Theorems & Definitions (16)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 1
  • Definition 5
  • Definition 6
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • ...and 6 more