Binary Iterative Hard Thresholding Converges with Optimal Number of Measurements for 1-Bit Compressed Sensing

Namiko Matsumoto; Arya Mazumdar

TL;DR

This work establishes that binary iterative hard thresholding (BIHT) for 1-bit compressed sensing converges to the true sparse signal using the optimal number of measurements. By introducing the restricted approximate invertibility condition (RAIC) and proving that it holds for Gaussian sensing matrices with $m = \tilde{O}(k/\epsilon)$, the authors obtain a convergence rate where the error decays as $d_{\mathcal{S}^{n-1}}(\mathbf{x},\hat{\mathbf{x}}^{(t)}) \le 2^{2^{-t}}\epsilon^{1-2^{-t}}$ and asymptotically reaches the $\epsilon$-ball. The analysis hinges on a two-regime, distance-aware approach: a large-distance regime analyzed via an $\varepsilon$-net and a small-distance regime controlled by local binary stable embeddings, enabling a finite-sample guarantee that matches information-theoretic lower bounds. This yields a practical, polynomial-time algorithm with the optimal dependence on both sparsity $k$ and accuracy $\epsilon$, advancing the theory of nonconvex recovery from highly quantized measurements.
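To make the rate above concrete, the following minimal sketch evaluates the bound $2^{2^{-t}}\epsilon^{1-2^{-t}}$ for the first few iterations. The function name `biht_error_bound` and the choice $\epsilon = 0.05$ are illustrative assumptions, not taken from the paper's code. Since the bound can be rewritten as $\epsilon\,(2/\epsilon)^{2^{-t}}$, it starts at $2$ for $t = 0$ and decreases toward $\epsilon$ as $t$ grows.

```python
import math


def biht_error_bound(t: int, eps: float) -> float:
    """Evaluate the bound 2^(2^-t) * eps^(1 - 2^-t) after t iterations.

    Hypothetical helper for illustration; only the formula comes from the text.
    """
    decay = 2.0 ** (-t)  # the exponent 2^-t halves with every iteration
    return (2.0 ** decay) * (eps ** (1.0 - decay))


if __name__ == "__main__":
    eps = 0.05  # assumed target accuracy, chosen for illustration
    for t in range(8):
        print(f"t={t}: bound={biht_error_bound(t, eps):.4f}")
```

Because the exponent $2^{-t}$ halves at each step, the gap between the bound and $\epsilon$ shrinks very quickly, which is why a few dozen iterations suffice in the reported experiments.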

Abstract

Compressed sensing has been a very successful high-dimensional signal acquisition and recovery technique that relies on linear operations. However, the actual measurements of signals have to be quantized before storing or processing. One-bit compressed sensing is a heavily quantized version of compressed sensing, where each linear measurement of a signal is reduced to just one bit: the sign of the measurement. Once enough such measurements are collected, the recovery problem in 1-bit compressed sensing aims to find the original signal with as much accuracy as possible. The recovery problem is related to the traditional "halfspace-learning" problem in learning theory. For recovery of sparse vectors, a popular reconstruction method from 1-bit measurements is the binary iterative hard thresholding (BIHT) algorithm. The algorithm is a simple projected sub-gradient descent method, and is known to converge well empirically, despite the nonconvexity of the problem. The convergence property of BIHT was not theoretically justified, except with an exorbitantly large number of measurements (i.e., a number of measurements greater than $\max\{k^{10}, 24^{48}, k^{3.5}/\epsilon\}$, where $k$ is the sparsity, $\epsilon$ denotes the approximation error, and even this expression hides other factors). In this paper we show that the BIHT algorithm converges with only $\tilde{O}(k/\epsilon)$ measurements. Note that this dependence on $k$ and $\epsilon$ is optimal for any recovery method in 1-bit compressed sensing. With this result, to the best of our knowledge, BIHT is the only practical and efficient (polynomial time) algorithm that requires the optimal number of measurements in all parameters (both $k$ and $\epsilon$). This is also an example of a gradient descent algorithm converging to the correct solution for a nonconvex problem, under suitable structural conditions.
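The projected sub-gradient iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the step size `eta = sqrt(2*pi)`, the factor of one half on the sign residual, the starting point (the first standard basis vector), and all function names are choices made here for the example.

```python
import numpy as np


def one_bit_measure(A, x):
    """Sign measurements in {-1, +1}; zero is mapped to +1 by convention."""
    return np.where(A @ x >= 0, 1.0, -1.0)


def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out


def normalized_biht(A, b, k, iters=25, eta=np.sqrt(2.0 * np.pi)):
    """Sketch of normalized BIHT; eta and the start point are assumptions."""
    m, n = A.shape
    x_hat = np.zeros(n)
    x_hat[0] = 1.0  # arbitrary k-sparse unit vector as the starting point
    for _ in range(iters):
        # Sub-gradient step driven by the measurements whose signs disagree.
        residual = 0.5 * (b - one_bit_measure(A, x_hat))
        x_tilde = x_hat + (eta / m) * (A.T @ residual)
        # Project back onto k-sparse vectors, then onto the unit sphere.
        x_hat = hard_threshold(x_tilde, k)
        norm = np.linalg.norm(x_hat)
        if norm > 0:
            x_hat = x_hat / norm
    return x_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k, m = 200, 5, 2000  # illustrative sizes, not the paper's settings
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
    x /= np.linalg.norm(x)
    A = rng.standard_normal((m, n))  # i.i.d. Gaussian measurement matrix
    x_hat = normalized_biht(A, one_bit_measure(A, x), k)
    print("l2 error:", np.linalg.norm(x - x_hat))
```

Only the signs of $\mathbf{A}\mathbf{x}$ enter the update, so the magnitude of the signal is unrecoverable; this is why both the signal and every iterate are kept on the unit sphere.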

Paper Structure (40 sections, 18 theorems, 64 equations, 2 figures, 1 algorithm)

This paper contains 40 sections, 18 theorems, 64 equations, 2 figures, 1 algorithm.

Introduction
Our Contribution and Techniques
Other Related Works
Organization
Preliminaries
Notations and Definitions
1-Bit Compressed Sensing and the BIHT Algorithm
Main Results and Techniques
BIHT Convergence Theorem
Technical Overview
The Restricted Approximate Invertibility Condition
Comparison of RAIC and Other Properties of Binary Embeddings
The Uniform Convergence of BIHT Approximations
The RAIC for an i.i.d. Gaussian Matrix
Large- and Small-Distance Regimes -- Steps \ref{enum:intro:techniques:raic:2} and \ref{enum:intro:techniques:raic:3}
...and 25 more sections

Key Result

Theorem 3.1

Let $a, b, c, d > 0$ be universal constants as in Eq. eqn:univConstants. Fix $\epsilon, \rho \in (0,1)$ and $k, m, n \in \mathbb{Z}_{+}$, where [...]. Let the measurement matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ have rows with i.i.d. Gaussian entries. Then, uniformly with probability at least $1 - \rho$, for every unknown $k$-sparse real-valued unit vector, $\mathbf{x} \in \mathcal{S}^{n-1} \cap$ [...]

Figures (2)

Figure 1: The left-hand side shows the error decay of BIHT approximations empirically and theoretically. The right-hand side displays the fraction of measurements which fall onto opposite sides of the hyperplanes associated with the true signal, $\mathbf{x}$, and the approximations. The empirical results were obtained by running $100$ trials of recovering random $k$-sparse unit vectors via the normalized BIHT algorithm for $25$ iterations. The parameters were set as: $k = 5$, $n = 2000$, $m = 1000$, $\epsilon = 0.05$, and $\rho = 0.05$.
Figure 2: This plot shows the (roughly linear) relationship between the number of measurements, $m$, ($x$-axis) and the inverse error ($y$-axis), where the error is the $\ell_{2}$-distance between the true signal and the approximation obtained after $25$ iterations of the normalized BIHT algorithm. The sparsity and dimension parameters were set, respectively, as: $k = 5$ and $n = 2000$.
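The quantity plotted on the right-hand side of Figure 1, the fraction of measurements whose signs differ between two vectors, can be computed as in the sketch below. The function names are hypothetical. The comparison with the normalized angle relies on a standard fact about Gaussian vectors: a random Gaussian hyperplane separates two vectors at angle $\theta$ with probability $\theta/\pi$, so for a large Gaussian matrix the mismatch fraction should be close to that value.

```python
import numpy as np


def sign_mismatch_fraction(A, x, y):
    """Fraction of rows a_i of A with sign(<a_i, x>) != sign(<a_i, y>)."""
    sx = np.where(A @ x >= 0, 1, -1)
    sy = np.where(A @ y >= 0, 1, -1)
    return float(np.mean(sx != sy))


def normalized_angle(x, y):
    """Angle between x and y divided by pi, a value in [0, 1]."""
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, m = 50, 20000  # illustrative sizes, not the figure's settings
    x = rng.standard_normal(n)
    y = x + 0.3 * rng.standard_normal(n)  # a perturbed copy of x
    A = rng.standard_normal((m, n))  # i.i.d. Gaussian rows
    print("mismatch fraction:", sign_mismatch_fraction(A, x, y))
    print("angle / pi:       ", normalized_angle(x, y))
```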

Theorems & Definitions (22)

Definition 2.1: Top-$k$ hard thresholding operation
Definition 2.2: Subset hard thresholding operation
Theorem 3.1
Corollary 3.2
Definition 3.1: Restricted approximate invertibility condition (RAIC)
Theorem 3.3
Lemma 4.1
Lemma 4.2
Lemma 4.3
Lemma A.1
...and 12 more
