Adaptive Robustness of Hypergrid Johnson-Lindenstrauss

Andrej Bogdanov, Alon Rosen, Neekon Vafa, Vinod Vaikuntanathan

TL;DR

This work studies how Johnson-Lindenstrauss dimension reduction behaves on the hypergrid, formulating the Contracting Hypergrid Vector problem (CHV) and identifying a pronounced statistical–computational gap: κ_stat = Θ((2B+1)^{-n/m}) versus κ_comp = Θ̃((1/B)√(m/n)). It develops two algorithms (kernel-based rounding and an online, discrepancy-inspired method) that achieve κ = O((1/B)√(m/n)) under suitable regimes, and gives evidence of hardness via a multiple overlap gap property and lattice-based reductions, including CLWE-based arguments. As a cryptographic application, rounded JL embeddings yield robust property-preserving hashes for Euclidean distance: distance-preserving compression for which no computationally bounded adversary can find a pair of inputs that violates the distortion bound. The results contribute both algorithmic insight into adaptive robustness for high-dimensional embeddings and cryptographic primitives that exploit the statistical–computational gap. Overall, the paper links geometric embedding, online algorithms, average-case hardness, and cryptographic applications within a single CHV framework.
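To get a feel for the size of the gap, the two thresholds quoted above can be evaluated numerically. The following is a minimal sketch that drops all hidden constants and logarithmic factors, so the numbers are only indicative; the function names and the sample parameter triples are illustrative assumptions, not taken from the paper.

```python
import math


def kappa_stat(n, m, B):
    """Statistical threshold (2B+1)^(-n/m), with the Theta constant dropped."""
    return (2 * B + 1) ** (-n / m)


def kappa_comp(n, m, B):
    """Computational threshold (1/B)*sqrt(m/n), constants and log factors dropped."""
    return math.sqrt(m / n) / B


if __name__ == "__main__":
    # Hypothetical parameter choices, picked only for illustration.
    for n, m, B in [(200, 100, 10), (1000, 100, 10), (1000, 100, 100)]:
        ks, kc = kappa_stat(n, m, B), kappa_comp(n, m, B)
        print(f"n={n} m={m} B={B}: kappa_stat={ks:.3e} kappa_comp={kc:.3e}")
```

In each of these samples the statistical threshold is far below the computational one, and the ratio widens as n/m grows, because κ_stat decays exponentially in n/m while κ_comp decays only polynomially.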

Abstract

Johnson and Lindenstrauss (Contemporary Mathematics, 1984) showed that for $n > m$, a scaled random projection $\mathbf{A}$ from $\mathbb{R}^n$ to $\mathbb{R}^m$ is an approximate isometry on any set $S$ of size at most exponential in $m$. If $S$ is larger, however, its points can contract arbitrarily under $\mathbf{A}$. In particular, the hypergrid $([-B, B] \cap \mathbb{Z})^n$ is expected to contain a point that is contracted by a factor of $\kappa_{\mathsf{stat}} = \Theta(B)^{-1/\alpha}$, where $\alpha = m/n$. We give evidence that finding such a point exhibits a statistical-computational gap precisely up to $\kappa_{\mathsf{comp}} = \widetilde{\Theta}(\sqrt{\alpha}/B)$. On the algorithmic side, we design an online algorithm achieving $\kappa_{\mathsf{comp}}$, inspired by a discrepancy minimization algorithm of Bansal and Spencer (Random Structures & Algorithms, 2020). On the hardness side, we show evidence via a multiple overlap gap property (mOGP), which in particular captures online algorithms; and a reduction-based lower bound, which shows hardness under standard worst-case lattice assumptions. As a cryptographic application, we show that the rounded Johnson-Lindenstrauss embedding is a robust property-preserving hash function (Boyle, Lavigne and Vaikuntanathan, TCC 2019) on the hypergrid for the Euclidean metric in the computationally hard regime. Such hash functions compress data while preserving $\ell_2$ distances between inputs up to some distortion factor, with the guarantee that even knowing the hash function, no computationally bounded adversary can find any pair of points that violates the distortion bound.
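A standard first-moment heuristic, offered here as an illustration rather than as the paper's argument, suggests where the exponent $-1/\alpha$ comes from. Assume the contraction factor of $\mathbf{x}$ is measured as $\|\mathbf{A}\mathbf{x}\|_2 / (\sqrt{m}\,\|\mathbf{x}\|_2)$ with $\mathbf{A} \sim \mathcal{N}(0,1)^{m \times n}$ (this normalization is an assumption). For a fixed $\mathbf{x} \neq 0$, the ratio $\|\mathbf{A}\mathbf{x}\|_2^2 / \|\mathbf{x}\|_2^2$ follows a $\chi^2_m$ distribution, so for $\kappa < 1$ and up to factors polynomial in $m$:

$$
\Pr\!\left[\|\mathbf{A}\mathbf{x}\|_2 \le \kappa \sqrt{m}\,\|\mathbf{x}\|_2\right] = \Pr\!\left[\chi^2_m \le \kappa^2 m\right] = \Theta(\kappa)^m,
\qquad
\mathbb{E}\!\left[\#\{\mathbf{x} \text{ contracted by } \kappa\}\right] \approx (2B+1)^n \cdot \Theta(\kappa)^m .
$$

Setting the expected count to order one gives $\kappa = \Theta\big((2B+1)^{-n/m}\big) = \Theta(B)^{-1/\alpha}$, which matches the stated $\kappa_{\mathsf{stat}}$. The heuristic only counts points and says nothing about how hard they are to find.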

Paper Structure

This paper contains 29 sections, 17 theorems, 105 equations, 5 figures.

Key Result

Theorem 1

For all $n > m$ and $\mathbf{A} \sim \mathcal{N}(0,1)^{m \times n}$, scaling and rounding a random vector in $\ker(\mathbf{A})$ yields $\mathbf{x} \in ([-B, B] \cap \mathbb{Z})^n$ such that with probability $1-o(1)$. This directly solves $\mathsf{CHV}$ where with probability $1 - o(1)$.
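The kernel-rounding idea in the theorem can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm as specified in the paper's Figure 4: the way the random kernel vector is sampled (projecting a Gaussian vector onto $\ker(\mathbf{A})$), the scaling rule (largest coordinate mapped to $B$), the contraction measure $\|\mathbf{A}\mathbf{x}\|_2/(\sqrt{m}\,\|\mathbf{x}\|_2)$, and all function names are choices made here for concreteness.

```python
import math
import random


def gaussian_matrix(m, n, rng):
    """Sample an m x n matrix with i.i.d. N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]


def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def solve(M, b):
    """Solve M y = b by Gaussian elimination with partial pivoting."""
    k = len(M)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, k):
            f = aug[r][c] / aug[c][c]
            for j in range(c, k + 1):
                aug[r][j] -= f * aug[c][j]
    y = [0.0] * k
    for r in reversed(range(k)):
        s = aug[r][k] - sum(aug[r][j] * y[j] for j in range(r + 1, k))
        y[r] = s / aug[r][r]
    return y


def random_kernel_vector(A, rng):
    """Project a Gaussian vector g onto ker(A): z = g - A^T (A A^T)^{-1} A g."""
    m, n = len(A), len(A[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gram = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]
    y = solve(gram, matvec(A, g))
    return [g[k] - sum(A[i][k] * y[i] for i in range(m)) for k in range(n)]


def round_kernel(A, B, rng):
    """Scale a kernel vector so its largest entry has magnitude B, then round.

    The scaling rule is an assumption of this sketch.
    """
    z = random_kernel_vector(A, rng)
    scale = B / max(abs(v) for v in z)
    return [round(scale * v) for v in z]


def contraction(A, x):
    """Contraction factor ||A x|| / (sqrt(m) ||x||) (assumed normalization)."""
    m = len(A)
    norm_x = math.sqrt(sum(v * v for v in x))
    norm_Ax = math.sqrt(sum(v * v for v in matvec(A, x)))
    return norm_Ax / (math.sqrt(m) * norm_x)


if __name__ == "__main__":
    rng = random.Random(0)
    A = gaussian_matrix(5, 20, rng)
    x = round_kernel(A, 20, rng)
    print(x, contraction(A, x))
```

The intuition is that the scaled kernel vector is mapped exactly to zero, so only the rounding error (at most 1/2 per coordinate) contributes to $\mathbf{A}\mathbf{x}$, while $\|\mathbf{x}\|_2$ grows with $B$.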

Figures (5)

  • Figure 1: Phase diagram of $\mathsf{CHV}$ in the asymptotic regime (up to lower-order additive terms). The blue and red boundaries are $\ln \kappa^{-1} = \tfrac{1}{2} \ln \alpha^{-1} + \ln B$ and $\ln \kappa^{-1} = \alpha^{-1} \ln B$, respectively.
  • Figure 2: Online Norm Minimization Algorithm $\mathsf{Cool}$, as analyzed in \ref{thm:cool}.
  • Figure 3: A sample realization of the stochastic process \ref{eq:process} with $m = 50$ with fixed temperature $b$. Since $b$ is fixed, $L_t$ is homogeneously linear in $b$, so the stochastic process $L_t/(bm)$ is independent of $b$.
  • Figure 4: Kernel Rounding Algorithm for $\mathsf{CHV}$, as analyzed in \ref{thm:roundkernel}.
  • Figure 5: The construction of the robust locality sensitive hash function, as used in \ref{main-construction}. See \ref{def-of-our-hash-primitive} for the syntax of robust locality sensitive hash functions.

Theorems & Definitions (59)

  • Definition 1: Contracting Hypergrid Vector
  • Theorem 1: Informal Version of Theorem \ref{thm:roundkernel}
  • Theorem 2: Informal Version of Theorem \ref{thm:cool}
  • Corollary 1
  • Proof: Proof of \ref{alg-cor-in-intro} given \ref{informal-cool} and \ref{thm:informal-roundkernel}
  • Theorem 3: Informal Version of \ref{thm:ogp}
  • Theorem 4: Informal Version of \ref{lattice-based-lower-bound}
  • Theorem 5: Informal Version of \ref{main-construction}
  • Corollary 2: Informal
  • Definition 2: Contracting Hypergrid Vector Problem ($\mathsf{CHV}$)
  • ...and 49 more