Quantum Sketches, Hashing, and Approximate Nearest Neighbors

Sajjad Hashemian

TL;DR

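In a broad quantum sketch model, an $n$-point ANN data structure cannot be compressed into $O(\log n)$ qubits: $\Omega(n)$ qubits are required. In candidate-scanning abstractions of hashing-based ANN, however, amplitude amplification yields a quadratic reduction in candidate checks, which is essentially optimal by Grover/BBBV-type bounds.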

Abstract

Motivated by Johnson--Lindenstrauss dimension reduction, amplitude encoding, and the view of measurements as hash-like primitives, one might hope to compress an $n$-point approximate nearest neighbor (ANN) data structure into $O(\log n)$ qubits. We rule out this possibility in a broad quantum sketch model: the dataset $P$ is encoded as an $m$-qubit state $\rho_P$, and each query is answered by an arbitrary query-dependent measurement on a fresh copy of $\rho_P$. For every approximation factor $c\ge 1$ and constant success probability $p>1/2$, we exhibit $n$-point instances in Hamming space $\{0,1\}^d$ with $d=\Theta(\log n)$ for which any such sketch requires $m=\Omega(n)$ qubits, via a reduction to quantum random access codes and Nayak's lower bound. These memory lower bounds coexist with potential quantum query-time gains: in candidate-scanning abstractions of hashing-based ANN, amplitude amplification yields a quadratic reduction in candidate checks, which is essentially optimal by Grover/BBBV-type bounds.

Paper Structure

This paper contains 8 sections, 7 theorems, 23 equations, and 1 figure.

Key Result

Theorem 1

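If an $(n,m,p)$-QRAC exists with $p\ge 1/2$, then $$m \ge (1 - H(p))\,n,$$ where $H(p) = -p\log_2 p - (1-p)\log_2(1-p)$ is the binary entropy function. In particular, for any constant $p>1/2$, one has $m=\Omega(n)$.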
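
To get a feel for the constants, the following minimal sketch evaluates the bound numerically. It assumes the standard form of Nayak's bound, $m \ge (1 - H(p))\,n$ with $H$ the binary entropy function; the function names `binary_entropy` and `qrac_qubit_lower_bound` are illustrative and do not come from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def qrac_qubit_lower_bound(n: int, p: float) -> float:
    """Evaluate (1 - H(p)) * n, the assumed form of Nayak's lower bound on m.

    n is the number of encoded bits and p the per-bit success probability.
    """
    if not 0.5 <= p <= 1.0:
        raise ValueError("success probability p must lie in [1/2, 1]")
    return (1.0 - binary_entropy(p)) * n


if __name__ == "__main__":
    # The bound is vacuous at p = 1/2 and grows linearly in n for any fixed p > 1/2.
    for p in (0.5, 0.6, 0.75, 0.9, 1.0):
        print(f"p = {p:.2f}: m >= {qrac_qubit_lower_bound(1000, p):.1f}")
```

The multiplier $1 - H(p)$ is a positive constant for every fixed $p > 1/2$, which is exactly why the theorem gives $m = \Omega(n)$.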

Figures (1)

  • Figure 1: Intuition for the lower-bound construction. Pick $C(1),\dots,C(n)\in\{0,1\}^m$ so that for all $i\neq j$, $\mathrm{Ham}(C(i),C(j))\ge m/4$. Then lift each $C(i)$ into a tight pair $u_i=(C(i),0)$ and $v_i=(C(i),1)$ in dimension $d=m{+}1$. The last coordinate creates a distance-$1$ toggle that encodes a bit, while the underlying code separation ensures that all points from different indices remain far apart, enabling the forcing lemma used in the reduction to QRACs.
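
The construction in Figure 1 can be sketched in a few lines. The code below is an illustration only: it picks the codewords $C(i)$ by random rejection sampling, which is an assumption made here for simplicity and not necessarily how the paper obtains its code. The helper names `sample_code` and `lift_to_pairs` are hypothetical.

```python
import random


def hamming(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))


def sample_code(n, m, seed=0, max_attempts=100_000):
    """Draw n words in {0,1}^m with pairwise Hamming distance >= m/4.

    Rejection sampling is an assumption of this sketch; any code with
    the same separation would serve.
    """
    rng = random.Random(seed)
    code = []
    for _ in range(max_attempts):
        if len(code) == n:
            return code
        word = tuple(rng.randint(0, 1) for _ in range(m))
        if all(hamming(word, c) >= m / 4 for c in code):
            code.append(word)
    if len(code) == n:
        return code
    raise RuntimeError("could not find a code with the requested separation")


def lift_to_pairs(code):
    """Lift each C(i) to the tight pair u_i = (C(i), 0), v_i = (C(i), 1)."""
    return [(c + (0,), c + (1,)) for c in code]


if __name__ == "__main__":
    m = 32
    pairs = lift_to_pairs(sample_code(n=8, m=m))
    # Within a pair the points differ only in the last coordinate.
    print(all(hamming(u, v) == 1 for u, v in pairs))
    # Points belonging to different indices stay at distance >= m/4.
    print(all(
        hamming(a, b) >= m / 4
        for i, pi in enumerate(pairs)
        for j, pj in enumerate(pairs) if i != j
        for a in pi for b in pj
    ))
```

Appending a coordinate never decreases Hamming distance, so the separation of the underlying code carries over to all cross-index pairs in dimension $d = m + 1$.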

Theorems & Definitions (20)

  • Definition 1: $c$-Approximate Nearest Neighbor
  • Definition 2
  • Remark 1
  • Definition 3: Quantum random access code
  • Theorem 1: Lower bound for QRACs [Nay99]
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Theorem 2
  • ...and 10 more