Optimal Compression of Unit Norm Vectors in the High Distortion Regime

Heng Zhu; Avishek Ghosh; Arya Mazumdar

TL;DR

This work analyzes the compression of a unit-norm vector in high dimensions in the high-distortion regime, focusing on worst-case inputs and randomized encoders. It derives fundamental lower bounds for biased ($\delta$-compressor) and unbiased ($\omega$-compressor) schemes and proposes practical, near-optimal compressors: Max Block Norm Quantization (MBNQ) for the biased $\delta$ setting and the Sparse Randomized Quantization Scheme (SRQS) for the unbiased $\omega$ setting, alongside a Gaussian-codebook baseline for both settings. The results establish that $\Omega(d\delta)$ bits are necessary for biased compression and $\Omega(d/\omega)$ bits for unbiased compression, with the practical schemes achieving these rates up to logarithmic factors. The findings bear directly on reducing communication in federated and distributed learning through near-optimal, scalable vector compression.
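The Gaussian-codebook idea mentioned above can be illustrated with a minimal sketch. It assumes that encoder and decoder share a random seed, that the encoder sends the index of the codeword best aligned with $v$ plus one scalar, and that the biased criterion takes the common form $\mathbb{E}\|Q(v)-v\|^2 \le (1-\delta)\|v\|^2$. The function names, the normalization of the codewords, and the extra transmitted scalar are illustrative assumptions, not the paper's construction.

```python
import numpy as np


def _shared_codebook(dim, num_bits, seed):
    # Hypothetical helper: both sides regenerate the same codebook from `seed`.
    rng = np.random.default_rng(seed)
    codebook = rng.standard_normal((2 ** num_bits, dim))
    # Normalize each codeword to unit norm (an assumption of this sketch).
    return codebook / np.linalg.norm(codebook, axis=1, keepdims=True)


def gaussian_codebook_compress(v, num_bits, seed=0):
    """Return (index, scale) of the codeword most aligned with v."""
    codebook = _shared_codebook(v.shape[0], num_bits, seed)
    scores = codebook @ v              # inner product with every codeword
    index = int(np.argmax(scores))     # costs num_bits bits to transmit
    return index, float(scores[index])


def gaussian_codebook_decompress(index, scale, dim, num_bits, seed=0):
    """Rebuild the estimate scale * codeword from the shared codebook."""
    codebook = _shared_codebook(dim, num_bits, seed)
    return scale * codebook[index]


def empirical_delta(v, v_hat):
    """1 - ||v - v_hat||^2 / ||v||^2, the delta achieved on this input."""
    return 1.0 - np.sum((v - v_hat) ** 2) / np.sum(v ** 2)
```

For a unit-norm input and unit-norm codewords, the achieved value returned by `empirical_delta` equals the squared inner product with the chosen codeword. The codebook has $2^{b}$ rows, so this sketch is only feasible for small `num_bits`; the scalar `scale` would also need to be quantized in a real scheme.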

Abstract

Motivated by the need for communication-efficient distributed learning, we investigate how to compress a unit norm vector into the minimum number of bits while still allowing for some acceptable level of distortion in recovery. This problem has been explored in the rate-distortion/covering code literature, but our focus is exclusively on the "high-distortion" regime. We approach this problem in a worst-case scenario, without any prior information on the vector, but allowing for the use of randomized compression maps. Our study considers both biased and unbiased compression methods and determines the optimal compression rates. It turns out that simple compression schemes are nearly optimal in this scenario. While the results are a mix of new and known, they are compiled in this paper for completeness.
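To make the biased/unbiased distinction concrete, the sketch below uses two textbook sparsifiers rather than the paper's MBNQ or SRQS schemes, whose details are not given here. Top-$k$ is a standard biased compressor with $\|Q(v)-v\|^2 \le (1-k/d)\|v\|^2$, and rescaled random-$k$ is a standard unbiased one with $\mathbb{E}\,Q(v)=v$ and $\mathbb{E}\|Q(v)\|^2 = (d/k)\|v\|^2$. The function names are hypothetical.

```python
import numpy as np


def top_k(v, k):
    """Biased: keep the k largest-magnitude coordinates, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]  # indices of the k largest |v_i|
    out[idx] = v[idx]
    return out


def rand_k(v, k, rng):
    """Unbiased: keep k uniformly random coordinates, rescaled by d/k."""
    d = v.shape[0]
    out = np.zeros_like(v)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = v[idx] * (d / k)                # rescaling makes E[out] = v
    return out
```

Choosing $k \ll d$ puts both operators in the high-distortion regime: top-$k$ then has small $\delta = k/d$, and random-$k$ has large $\omega = d/k$.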

Paper Structure (20 sections, 8 theorems, 56 equations, 1 table, 4 algorithms)

Introduction
Our contributions
An optimal but inefficient biased $\delta$-compressor
A near-optimal and efficient biased $\delta$-compressor
A near-optimal and efficient unbiased $\omega$-compressor
Related Work
$\epsilon$-nets
Quantization
Federated Learning and Communication Cost
Biased $\delta$-Compressors
Lower Bound
Random Gaussian Codebook Scheme
Max Block Norm Quantization (MBNQ)
Unbiased $\omega$-Compressors
Lower Bound
...and 5 more sections

Key Result

Proposition 1

Let $Q(v)$ be the compressor of $v$ satisfying Definition 2 (biased $\delta$-compressor). Then the number of bits $b$ required to transmit $Q(v)$ satisfies $b = \Omega(d\delta)$.
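For orientation, the two compressor notions named in this summary are commonly formalized as below. These are the standard forms from the compression literature and are an assumption here: the paper's own Definitions 1 and 2 may differ in constants or normalization.

```latex
% Assumed standard forms; Q is a (possibly randomized) map from R^d to R^d.
\begin{align*}
\text{Biased } \delta\text{-compressor:}\quad
  & \mathbb{E}\,\|Q(v) - v\|^2 \le (1 - \delta)\,\|v\|^2,
  \qquad 0 < \delta \le 1, \\
\text{Unbiased } \omega\text{-compressor:}\quad
  & \mathbb{E}\,Q(v) = v, \qquad
  \mathbb{E}\,\|Q(v)\|^2 \le \omega\,\|v\|^2,
  \qquad \omega \ge 1.
\end{align*}
```

Under these forms, the high-distortion regime corresponds to small $\delta$ and large $\omega$. Some authors instead bound $\mathbb{E}\|Q(v)-v\|^2$ in the unbiased case, which shifts $\omega$ by one because unbiasedness gives $\mathbb{E}\|Q(v)-v\|^2 = \mathbb{E}\|Q(v)\|^2 - \|v\|^2$.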

Theorems & Definitions (18)

Definition 1: Unbiased $\omega$-compressor
Definition 2: Biased $\delta$-compressor
Proposition 1
proof
Theorem 1
Remark 1
Theorem 2
proof
Remark 2
Theorem 3: chen2020breaking, Appendix D.2
...and 8 more
