Learning to Reconstruct Signals From Binary Measurements

Julián Tachella; Laurent Jacques

TL;DR

This work explores the extreme case of learning from binary observations. It provides necessary and sufficient conditions on the number of measurements required to identify a set of signals from incomplete binary data, and introduces SSBM, a novel self-supervised learning approach that requires only binary data for training.

Abstract

Recent advances in unsupervised learning have highlighted the possibility of learning to reconstruct signals from noisy and incomplete linear measurements alone. These methods play a key role in medical and scientific imaging and sensing, where ground truth data is often scarce or difficult to obtain. However, in practice, measurements are not only noisy and incomplete but also quantized. Here we explore the extreme case of learning from binary observations and provide necessary and sufficient conditions on the number of measurements required for identifying a set of signals from incomplete binary data. Our results are complementary to existing bounds on signal recovery from binary measurements. Furthermore, we introduce a novel self-supervised learning approach, which we name SSBM, that only requires binary data for training. We demonstrate in a series of experiments with real datasets that SSBM performs on par with supervised learning and outperforms sparse reconstruction methods with a fixed wavelet basis by a large margin.

Paper Structure (27 sections, 11 theorems, 85 equations, 12 figures, 2 tables)

Introduction
Unsupervised learning in inverse problems.
Quantized and one-bit sensing.
One-bit matrix completion and dictionary learning.
Signal Recovery Preliminaries
Model Identification from Binary Observations
A Lower Bound on the Identification Error
Remark
A Sufficient Condition for Model Identification
Learning to Reconstruct
Sample complexity
Learning Algorithms
Analysis of the proposed loss
Model identification perspective
Experiments
...and 12 more sections

Key Result

Theorem 1

Let $A$ be a matrix with i.i.d. entries sampled from a standard Gaussian distribution and assume that $\operatorname{boxdim}\left(\mathcal{X}\right)<k$, such that $\mathfrak{N}(\mathcal{X},\epsilon)\leq \epsilon^{-k}$ for all $\epsilon<\epsilon_0$ with $\epsilon_0\in (0,\frac{1}{2})$. For $\delta\leq$ [...] then for all $x,s\in \mathcal{X}$, we have that [...] with probability greater than $1-\xi$.

Figures (12)

Figure 1: We propose a method for learning to reconstruct signals from binary measurements, using only the binary observations themselves for training. The learned reconstruction function can discover unseen patterns in the data (in this case the clothes of FashionMNIST; see the Experiments section), which cannot be recognized in the standard linear reconstructions (no learning). We also provide theoretical bounds that characterize how well we can expect to learn the set of signals from binary measurement data alone.
Figure 2: Geometry of the 1-bit signal recovery problem with $m=5$ and $n=3$. Left: The binary sensing operator $\operatorname{sign}(A\,\cdot)$ defines a tessellation of the sphere into multiple consistency cells, which are defined as all vectors $x\in \mathbb{S}^{2}$ associated with the same binary code. The consistency cell associated with a given measurement $y$ is shown in green. Each red line is a great circle defined by all points of $\mathbb{S}^{2}$ perpendicular to one row of $A$. Middle: If the signal set consists of all vectors in the sphere, i.e., $\mathcal{X} = \mathbb{S}^{2}$, the center of the cell is the optimal reconstruction $\hat{f}(y)$ (depicted with a blue cross) and the recovery error (denoted by $\delta$) is given by the radius of the cell. Right: If the signal set (depicted in black) occupies only a small subset of $\mathbb{S}^{2}$, i.e., it has a small box-counting dimension, the optimal reconstruction corresponds to the center of the intersection between the signal set and the consistency cell, and the resulting signal recovery error is smaller.
Figure 3: Illustration of the model identification problem from binary measurements with $n=3$, $m=4$, and $G=3$. A signal set with box-counting dimension 1 is depicted in black. The red lines define the frontiers of the consistency cells associated with operators $A_1,\dots, A_3$. From left to right: The signal set, the estimation of the signal set associated with $A_1,\dots,A_3$ and the overall estimate $\hat{\mathcal{X}}$.
Figure 4: Illustration of the oracle argument in the example of \ref{fig:illustration}. Left: The signal set $\mathcal{X}\subset \mathbb{S}^{2}$ is depicted in black. Middle: Cells intersected by the oracle system are indicated in green. Right: The identified set $\hat{\mathcal{X}}$ is indicated in green, and is larger than the oracle counterpart.
Figure 5: Evaluated training losses for enforcing sign measurement consistency $\operatorname{sign}(A\hat{x})=y$ of reconstructions $f_{\theta}(y)=\hat{x}$. Left: The loss functions are shown for the case $y=1$. Right: Average test PSNR of different measurement consistency losses on the MNIST dataset with $G=10$ operators.
...and 7 more figures
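
The sensing model described in the Figure 2 caption can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' code: the helper names `binary_measure` and `is_consistent`, the random seed, the sample count, and the convention of mapping `sign(0)` to `+1` are all assumptions made here. It draws a Gaussian $A$ with $m=5$ and $n=3$, computes the binary code $y=\operatorname{sign}(Ax)$ of a point on the sphere, checks sign consistency of candidate reconstructions, and estimates how many consistency cells the operator creates by sampling the sphere.

```python
import numpy as np


def binary_measure(A, x):
    """One-bit measurements y = sign(A x); sign(0) is mapped to +1 (assumed convention)."""
    return np.where(A @ x >= 0, 1, -1)


def is_consistent(A, x_hat, y):
    """True if x_hat lies in the consistency cell of y, i.e. sign(A x_hat) == y."""
    return bool(np.all(binary_measure(A, x_hat) == y))


rng = np.random.default_rng(0)
m, n = 5, 3  # same sizes as in the Figure 2 caption

# Gaussian sensing matrix and a signal on the unit sphere
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
x /= np.linalg.norm(x)

y = binary_measure(A, x)  # binary code of x

# Binary measurements discard the norm: any positive rescaling of x has the same code,
# while the antipodal point -x has every sign flipped.
same_cell = is_consistent(A, 3.0 * x, y)
opposite_cell = is_consistent(A, -x, y)

# Estimate the number of consistency cells by sampling the sphere and
# counting distinct binary codes (at most 2**m codes can occur).
pts = rng.standard_normal((20000, n))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
codes = np.where(pts @ A.T >= 0, 1, -1)
num_cells = len(np.unique(codes, axis=0))

print("code of x:", y)
print("observed cells:", num_cells)
```

Every point sharing the code `y` is indistinguishable from `x` given a single operator, which is why the caption ties the recovery error to the size of the cell (or of its intersection with the signal set). The sampled cell count is a lower estimate, since very small cells may receive no samples.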

Theorems & Definitions (23)

Theorem 1
Definition 3.1: Model identification error
Proposition 2
proof
Proposition 3
proof
Corollary 4
Proposition 5
proof
Corollary 6
...and 13 more
