Binary Iterative Hard Thresholding Converges with Optimal Number of Measurements for 1-Bit Compressed Sensing
Namiko Matsumoto, Arya Mazumdar
TL;DR
This work establishes that binary iterative hard thresholding (BIHT) for 1-bit compressed sensing converges to the true sparse signal using the optimal number of measurements. By introducing the restricted approximate invertibility condition (RAIC) and proving it holds for Gaussian sensing matrices with $m = \tilde{O}(k/\epsilon)$, the authors obtain a convergence rate where the error decays as $d_{\mathcal{S}^{n-1}}(\mathbf{x},\hat{\mathbf{x}}^{(t)}) \le 2^{2^{-t}}\epsilon^{1-2^{-t}}$, and asymptotically reaches the $\epsilon$-ball. The analysis hinges on a two-regime, distance-aware approach: a large-distance regime analyzed via an $\varepsilon$-net and a small-distance regime controlled by local binary stable embeddings, enabling a finite-sample guarantee that matches information-theoretic lower bounds. This yields a practical, polynomial-time algorithm with the optimal dependence on both sparsity $k$ and accuracy $\epsilon$, significantly advancing theory for nonconvex recovery with highly quantized measurements.
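To get a feel for the stated rate, the bound $2^{2^{-t}}\epsilon^{1-2^{-t}}$ can be evaluated directly. The short Python sketch below does only that arithmetic; the function name `biht_error_bound` is a hypothetical helper introduced for illustration and is not taken from the paper.

```python
def biht_error_bound(eps, t):
    """Evaluate 2^(2^-t) * eps^(1 - 2^-t), the stated bound after t iterations."""
    decay = 2.0 ** (-t)  # the exponent 2^-t halves with every iteration
    return (2.0 ** decay) * (eps ** (1.0 - decay))


if __name__ == "__main__":
    eps = 0.01  # illustrative target accuracy (an assumption, not from the paper)
    for t in range(6):
        print(t, biht_error_bound(eps, t))
```

At $t = 0$ the expression equals $2$, which carries no information about the estimate, and as $t$ grows it decreases toward $\epsilon$, matching the statement that the iterates asymptotically reach the $\epsilon$-ball.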
Abstract
Compressed sensing has been a very successful high-dimensional signal acquisition and recovery technique that relies on linear operations. However, the actual measurements of signals have to be quantized before storing or processing. One-bit compressed sensing is a heavily quantized version of compressed sensing, where each linear measurement of a signal is reduced to just one bit: the sign of the measurement. Once enough such measurements are collected, the recovery problem in 1-bit compressed sensing aims to find the original signal with as much accuracy as possible. The recovery problem is related to the traditional "halfspace-learning" problem in learning theory. For recovery of sparse vectors, a popular reconstruction method from 1-bit measurements is the binary iterative hard thresholding (BIHT) algorithm. The algorithm is a simple projected sub-gradient descent method, and is known to converge well empirically, despite the nonconvexity of the problem. The convergence property of BIHT was not theoretically justified, except with an exorbitantly large number of measurements (i.e., a number of measurements greater than $\max\{k^{10}, 24^{48}, k^{3.5}/\epsilon\}$, where $k$ is the sparsity, $\epsilon$ denotes the approximation error, and even this expression hides other factors). In this paper we show that the BIHT algorithm converges with only $\tilde{O}(\frac{k}{\epsilon})$ measurements. Note that this dependence on $k$ and $\epsilon$ is optimal for any recovery method in 1-bit compressed sensing. With this result, to the best of our knowledge, BIHT is the only practical and efficient (polynomial time) algorithm that requires the optimal number of measurements in all parameters (both $k$ and $\epsilon$). This is also an example of a gradient descent algorithm converging to the correct solution for a nonconvex problem, under suitable structural conditions.
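The abstract describes BIHT as a projected sub-gradient descent method on sign measurements. A minimal NumPy sketch of that idea follows. It is an illustration under stated assumptions, not the authors' exact procedure: the step size `step`, the zero initialization, the convention that the sign of zero is $+1$, and the function names `hard_threshold` and `biht` are all choices made here for the example.

```python
import numpy as np


def hard_threshold(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    if k > 0:
        idx = np.argpartition(np.abs(v), -k)[-k:]
        out[idx] = v[idx]
    return out


def biht(A, b, k, step=1.0, iters=50):
    """Sketch of binary iterative hard thresholding.

    A     : (m, n) sensing matrix
    b     : (m,) vector of +/-1 measurements, b = sign(A @ x)
    k     : sparsity level of the unknown unit vector x
    step  : sub-gradient step size (hypothetical default, not from the paper)
    iters : number of iterations
    """
    m, n = A.shape
    x_hat = np.zeros(n)  # assumed initialization
    for _ in range(iters):
        # Signs of the current estimate's measurements; sign(0) is taken as +1.
        signs = np.where(A @ x_hat >= 0, 1.0, -1.0)
        # Sub-gradient step driven by the sign mismatches.
        x_tilde = x_hat + (step / m) * (A.T @ (b - signs))
        # Project onto k-sparse vectors, then onto the unit sphere.
        x_hat = hard_threshold(x_tilde, k)
        norm = np.linalg.norm(x_hat)
        if norm > 0:
            x_hat = x_hat / norm
    return x_hat
```

A typical use would draw a Gaussian matrix `A`, form `b = np.sign(A @ x)` for a $k$-sparse unit vector `x`, and compare `biht(A, b, k)` with `x`. Normalizing each iterate reflects the fact that sign measurements carry no information about the magnitude of the signal, so only its direction can be recovered.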
