Optimal Quantized Compressed Sensing via Projected Gradient Descent

Junren Chen; Ming Yuan

TL;DR

This work develops a unified framework for recovering structured signals from quantized measurements $\mathbf{y}=\mathcal{Q}(\mathbf{A}\mathbf{x}-\bm{\tau})$ by employing projected gradient descent on a one-sided $\ell_1$-loss and establishing a restricted approximate invertibility condition (RAIC) that guarantees convergence with error close to information-theoretic limits. The method applies broadly to 1-bit CS, dithered 1-bit CS, and dithered multi-bit CS, with extensions to low-rank matrix recovery; the analysis combines sharp concentration bounds, gradient clipping for multi-bit settings, and a novel product-embedding approach to achieve global convergence in the multi-bit regime. Information-theoretic bounds are integrated with algorithmic guarantees to show that PGD attains the optimal or near-optimal rates, including $\tilde{O}(\frac{k}{mL})$ for $k$-sparse signals and $\tilde{O}((\frac{k}{mL})^{1/3})$ for effectively sparse signals, as well as analogous results for low-rank matrices. The results offer an efficient, scalable alternative to intractable decoders like constrained Hamming distance minimization, providing practical near-optimal recovery in a broad class of quantized sensing models.

Abstract

This paper provides a unified treatment of the recovery of structured signals living in a star-shaped set from general quantized measurements $\mathcal{Q}(\mathbf{A}\mathbf{x}-\bm{\tau})$, where $\mathbf{A}$ is a sensing matrix, $\bm{\tau}$ is a vector of (possibly random) quantization thresholds, and $\mathcal{Q}$ denotes an $L$-level quantizer. The ideal estimator, which returns a signal consistent with the quantized measurements, is optimal in some important instances but typically infeasible to compute. To address this, we study the projected gradient descent (PGD) algorithm with respect to the one-sided $\ell_1$-loss and identify the conditions under which PGD achieves the same error rate, up to logarithmic factors. These conditions include estimates of the separation probability, the small-ball probability, and some moment bounds that are easy to validate. For the multi-bit case, we also develop a complementary approach based on product embedding to show global convergence. When applied to popular models such as 1-bit compressed sensing with Gaussian $\mathbf{A}$ and zero $\bm{\tau}$, and the dithered 1-bit/multi-bit models with sub-Gaussian $\mathbf{A}$ and uniform dither $\bm{\tau}$, our unified treatment yields error rates that improve on or match the sharpest results in all instances. In particular, PGD achieves the information-theoretically optimal rate $\tilde{O}(\frac{k}{mL})$ for recovering $k$-sparse signals, and the rate $\tilde{O}((\frac{k}{mL})^{1/3})$ for effectively sparse signals. For 1-bit compressed sensing of sparse signals, our result recovers the optimality of normalized binary iterative hard thresholding (NBIHT), which was proved very recently.
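To make the PGD iteration concrete, the following is a minimal sketch for the simplest model named in the abstract: 1-bit compressed sensing with Gaussian $\mathbf{A}$, zero thresholds, and a $k$-sparse unit-norm signal. It is not the paper's algorithm listing. The loss form $\frac{1}{2m}\sum_i(|\mathbf{a}_i^\top\mathbf{u}|-y_i\mathbf{a}_i^\top\mathbf{u})$, the initialization from $\mathbf{A}^\top\mathbf{y}$, the iteration count, and all function names are assumptions made for illustration. The step size $\eta=\sqrt{2\pi}$ is chosen because, for standard Gaussian rows and unit vectors $\mathbf{u},\mathbf{x}$, it makes the expected subgradient step equal to $\mathbf{u}-\mathbf{x}$.

```python
import numpy as np

def sign_pm1(z):
    """1-bit quantizer, using the convention sign(0) = +1."""
    return np.where(z >= 0, 1.0, -1.0)

def project_sparse_sphere(v, k):
    """Keep the k largest-magnitude entries of v, then rescale to unit norm."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    norm = np.linalg.norm(out)
    return out / norm if norm > 0 else out

def pgd_one_bit(A, y, k, eta=np.sqrt(2 * np.pi), n_iter=30):
    """PGD on the (assumed) one-sided l1 loss for y = sign(A x)."""
    m = A.shape[0]
    # Illustrative initialization: project the back-projection A^T y.
    x = project_sparse_sphere(A.T @ y, k)
    for _ in range(n_iter):
        # Subgradient of (1/2m) * sum(|a_i^T u| - y_i * a_i^T u) at u = x.
        grad = A.T @ (sign_pm1(A @ x) - y) / (2 * m)
        x = project_sparse_sphere(x - eta * grad, k)
    return x

def demo(n=200, k=5, m=2000, seed=0):
    """Hypothetical synthetic experiment; the sizes are arbitrary."""
    rng = np.random.default_rng(seed)
    x_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    x_true[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    A = rng.standard_normal((m, n))
    y = sign_pm1(A @ x_true)
    return x_true, pgd_one_bit(A, y, k)
```

Calling `x_true, x_hat = demo()` and inspecting `np.linalg.norm(x_hat - x_true)` gives the recovery error. The projection step here (hard thresholding followed by normalization) is the same one NBIHT uses, so this sketch is an instance of the 1-bit sparse case that the abstract relates to NBIHT.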

Paper Structure (70 sections, 29 theorems, 264 equations, 6 figures, 2 tables, 1 algorithm)

Introduction
1-bit compressed sensing (1bCS).
Dithered 1-bit compressed sensing (D1bCS).
Dithered multi-bit compressed sensing (DMbCS).
Information-Theoretic Bounds
Projected Gradient Descent
Convergence via RAIC
Auxiliary assumptions for Theorem \ref{thm:convergence}:
A Sharp Approach to RAIC
Gradient clipping
Bound the first term on the right-hand side of (\ref{eq:clipping2}):
Orthogonal decomposition
Sharp concentration bounds
RAIC and Main Theorem
Auxiliary assumptions for Theorem \ref{thm:main}:
...and 55 more sections

Key Result

Theorem 1

Suppose there is a $K$-dimensional linear subspace $V_K$ contained in $\mathcal{K}$. Given $\mathbf{A}\in \mathbb{R}^{m\times n}$, $\bm{\tau}\in \mathbb{R}^m$ and $\beta>\alpha\ge 0$, our goal is to recover $\mathbf{x}\in \mathcal{W}:=\mathcal{K}\cap\mathbb{A}_{\alpha,\beta}$ from $\mathbf{y}=\mathc

Figures (6)

Figure 1: An example of an $(L=4)$-level quantizer. Note that we can assume (\ref{wolg}) because, for instance in the right figure, we can always modify $\{q_i\}_{i=1}^4$ to $\{q_i^{new}\}_{i=1}^4$. Similarly, in the 1-bit case where $\Delta=2$, we can always assume $(q_1,q_2)=(-1,1)$ to fulfill (\ref{wolg}).
Figure 2: The uniform quantizer $\mathcal{Q}_\delta$ and its saturated version $\mathcal{Q}_{\delta,4}$ under the resolution $\delta=1$.
Figure 3: RAIC indicates closeness of the ideal step and the actual subgradient step.
Figure 4: In 1bCS, PGD achieves error rates $\tilde{O}(\frac{k}{m})$ for $\mathbf{x}\in \Sigma^n_k$ (left), $\tilde{O}(\frac{\bar{r}(n_1+n_2)}{m})$ for $\mathbf{x}\in M^{n_1,n_2}_{\bar{r}}$ (middle), and $\tilde{O}((\frac{k}{m})^{1/3})$ for $\mathbf{x}\in \sqrt{k}\mathbb{B}_1^n\cap\mathbb{S}^{n-1}$ (right). In sparse recovery, we recover $k$-sparse $500$-dimensional signals under $m=c\cdot\text{ml}$ with $\text{ml}=400:200:1200$. In low-rank recovery, we recover rank-$\bar{r}$ $25\times 25$ matrices under $m=c\cdot\text{ml}$ with $\text{ml}=400:200:1200$. In recovering effectively sparse signals, we test $300$-dimensional signals under $m=c\cdot\text{ml}$ with $\text{ml}=800:400:2400$.
Figure 5: In D1bCS, PGD achieves error rates $\tilde{O}(\frac{k}{m})$ for $\mathbf{x}\in \Sigma^n_k$ (left), $\tilde{O}(\frac{\bar{r}(n_1+n_2)}{m})$ for $\mathbf{x}\in M^{n_1,n_2}_{\bar{r}}$ (middle), and $\tilde{O}((\frac{k}{m})^{1/3})$ for $\mathbf{x}\in \sqrt{k}\mathbb{B}_1^n\cap\mathbb{S}^{n-1}$ (right). In sparse recovery, we recover $k$-sparse $500$-dimensional signals under $m=c\cdot\text{ml}$ with $\text{ml}=400:200:1600$. In low-rank recovery, we recover rank-$\bar{r}$ $25\times 25$ matrices under $m=c\cdot\text{ml}$ with $\text{ml}=600:200:1600$ and $\Lambda=1.5$. In recovering effectively sparse signals, we test $300$-dimensional signals under $m=c\cdot\text{ml}$ with $\text{ml}=800:400:2400$.
...and 1 more figure
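
The quantizers in the Figure 2 caption, and the dithered measurement model $\mathbf{y}=\mathcal{Q}(\mathbf{A}\mathbf{x}-\bm{\tau})$, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's definitions: it assumes the common mid-rise convention $\mathcal{Q}_\delta(a)=\delta(\lfloor a/\delta\rfloor+\frac{1}{2})$, an even number of levels $L$ for the saturated version, and a dither drawn uniformly from $[-\delta/2,\delta/2]$. The function names are hypothetical.

```python
import numpy as np

def uniform_quantize(a, delta):
    """Mid-rise uniform quantizer with resolution delta (assumed convention)."""
    return delta * (np.floor(np.asarray(a, dtype=float) / delta) + 0.5)

def saturated_quantize(a, delta, L):
    """Uniform quantizer clipped to L levels (assumes L is even).

    The output levels are +-delta/2, +-3*delta/2, ..., +-(L-1)*delta/2.
    """
    top = (L - 1) * delta / 2
    return np.clip(uniform_quantize(a, delta), -top, top)

def dithered_measurements(A, x, delta, L, rng):
    """Return y = Q(A x - tau) together with the dither tau.

    The dither range [-delta/2, delta/2] is an assumption of this sketch.
    """
    tau = rng.uniform(-delta / 2, delta / 2, size=A.shape[0])
    return saturated_quantize(A @ x - tau, delta, L), tau
```

With $\delta=1$ and $L=4$, as in the Figure 2 caption, `saturated_quantize` under these assumptions takes values in $\{-1.5,-0.5,0.5,1.5\}$.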

Theorems & Definitions (85)

Theorem 1: Information-theoretic lower bound
Remark 1
Remark 2
Remark 3
Theorem 2: Quantized embedding property
Theorem 3: Information-theoretic upper bound
proof
Remark 4
Remark 5
Definition 1: RAIC
...and 75 more
