Average-Distortion Sketching

Yiqiao Bao, Anubhav Baweja, Nicolas Menand, Erik Waingarten, Nathan White, Tian Zhang

TL;DR

This work introduces average-distortion sketching, a distribution-aware relaxation of worst-case metric sketching: points of a metric space are compressed so that distances are never over-estimated and the average pairwise distance under a fixed distribution μ is approximately preserved. For ℓ_p spaces with p>2, the authors construct average-distortion sketches that achieve distortion c, for any c above a fixed constant, using 2^{O(p/c)}·log^2(dΔ) bits, which is impossible for worst-case sketches and yields improved approximate nearest neighbor (NN) data structures. They also give an asymmetric sketch variant that reduces space on one input, and a lower bound in a certificate model suggesting that the exponential dependence on p/c may be necessary. By connecting average-distortion sketches to NN data structures and data-dependent hashing, the paper improves the approximation for NN search over ℓ_p with large p from O(p) to any sufficiently large constant c, using n^{O(p/c)} space, and lays out open problems on broader metric classes and tighter bounds.

Abstract

We introduce average-distortion sketching for metric spaces. As in (worst-case) sketching, these algorithms compress points in a metric space while approximately recovering pairwise distances. The novelty is studying average-distortion: for any fixed (yet, arbitrary) distribution $\mu$ over the metric, the sketch should not over-estimate distances, and it should (approximately) preserve the average distance with respect to draws from $\mu$. The notion generalizes average-distortion embeddings into $\ell_1$ [Rabinovich '03, Kush-Nikolov-Tang '21] as well as data-dependent locality-sensitive hashing [Andoni-Razenshteyn '15, Andoni-Naor-Nikolov-et-al. '18], which have been recently studied in the context of nearest neighbor search.

  • For all $p \in (2, \infty)$ and any $c$ larger than a fixed constant, we give an average-distortion sketch for $([\Delta]^d, \ell_p)$ with approximation $c$ and bit-complexity $\text{poly}(2^{p/c} \cdot \log(d\Delta))$, which is provably impossible in (worst-case) sketching.
  • As an application, we improve on the approximation of sublinear-time data structures for nearest neighbor search over $\ell_p$ (for large $p > 2$). The prior best approximation was $O(p)$ [Andoni-Naor-Nikolov-et-al. '18, Kush-Nikolov-Tang '21], and we show it can be any $c$ larger than a fixed constant (irrespective of $p$) by using $n^{O(p/c)}$ space. We give some evidence that $2^{\Omega(p/c)}$ space may be necessary by giving a lower bound on average-distortion sketches which produce a certain probabilistic certificate of farness (which our sketches crucially rely on).
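The two requirements stated in the abstract (never over-estimate a distance, and approximately preserve the average distance under $\mu$) can be checked empirically on a toy example. The snippet below is only an illustrative sketch under stated assumptions: the coordinate-subsampling "sketch" is a hypothetical stand-in chosen because it trivially never over-estimates $\ell_p$ distances, and it is not the paper's construction and carries none of its guarantees. The names `toy_sketch`, `toy_estimate`, and `average_distortion` are made up for this example, $\mu$ is taken to be uniform over a small random point set, and "average distortion" is read as the ratio of the average true distance to the average estimated distance, which is one plausible reading of the abstract rather than a quote of Definition 1.1.

```python
import random


def lp_dist(x, y, p):
    """Exact l_p distance between two equal-length integer vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def toy_sketch(x, coords):
    """Hypothetical toy sketch: keep only the coordinates listed in `coords`."""
    return [x[i] for i in coords]


def toy_estimate(sx, sy, p):
    """l_p distance restricted to the kept coordinates.

    Dropping coordinates only removes non-negative terms from the sum,
    so this estimate never exceeds the true l_p distance.
    """
    return lp_dist(sx, sy, p)


def average_distortion(points, coords, p):
    """Ratio (average true distance) / (average estimated distance).

    Assumes mu is uniform over `points`. Summing over unordered pairs of
    distinct points gives the same ratio as summing over all ordered
    pairs, because distances are symmetric and zero on the diagonal.
    """
    true_sum = 0.0
    est_sum = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            t = lp_dist(points[i], points[j], p)
            e = toy_estimate(toy_sketch(points[i], coords),
                             toy_sketch(points[j], coords), p)
            # Condition 1: the estimate must not over-estimate the distance.
            assert e <= t + 1e-9
            true_sum += t
            est_sum += e
    # Condition 2 is summarized by this ratio (smaller is better, 1 is exact).
    return float("inf") if est_sum == 0 else true_sum / est_sum


# Assumed toy parameters: dimension D, coordinate range DELTA, exponent P.
D, DELTA, P, N_POINTS = 64, 100, 4, 30
rng = random.Random(0)
points = [[rng.randint(1, DELTA) for _ in range(D)] for _ in range(N_POINTS)]
coords = rng.sample(range(D), 16)  # keep 16 of the 64 coordinates

print("average distortion of the toy sketch:",
      average_distortion(points, coords, P))
```

Keeping every coordinate makes the estimate exact and the ratio equal to 1. Keeping fewer coordinates increases the ratio, which is the quantity an average-distortion sketch is asked to bound by $c$.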

Paper Structure

This paper contains 22 sections, 20 theorems, 49 equations, 4 figures.

Key Result

Theorem 1

For any $c$ greater than a fixed universal constant and any $p \in (2, \infty)$, there exists an average-distortion sketch for any distribution over $([\Delta]^d, \ell_p)$ with distortion $c$ using $2^{O(p/c)}\cdot\log^2(d\Delta)$ bits. In fact, for any $x,y \in [\Delta]^d$, a sketch using $2^{O(p/c)

Figures (4)

  • Figure 1: Single-Scale Sketch for $([\Delta]^d, \ell_{p})$
  • Figure 2: Accompanying diagram for the proof of Lemma \ref{lem:bad-events}. The events are labeled as nodes, leading either to the cases considered in Lemma \ref{lem:bad-events} or to two nodes labeled $\mathsf{FAR}$, where the sketch will output $\mathsf{FAR}$. Edges are labeled "T" or "F" according to whether the event at a node holds ("T") or does not hold ("F").
  • Figure 3: Core-Preprocess Subroutine.
  • Figure 4: Core-Query Subroutine.

Theorems & Definitions (53)

  • Definition 1.1: Average-Distortion Sketches
  • Theorem 1: Average-Distortion Sketching for $\ell_p$
  • Theorem 2: Approximate Nearest Neighbor in $\ell_p$
  • Lemma 2.1
  • Remark 2.2
  • Lemma 2.3
  • Lemma 2.4
  • proof
  • proof: Proof of \ref{lem:close}
  • Lemma 2.5
  • ...and 43 more