$(1+\varepsilon)$-ANN Data Structure for Curves via Subspaces of Bounded Doubling Dimension

Jacobus Conradi, Anne Driemel, Benedikt Kolbe

TL;DR

The paper tackles the $(1+\varepsilon)$-ANN problem for polygonal curves under the Fréchet distance, addressing the obstacle of unbounded doubling dimension in the full curve space. It introduces a closely related subspace ${\mathbb X}^*$ with bounded doubling dimension and uses this to port ANN techniques from spaces with low doubling dimension, yielding a data structure with preprocessing $F(d,k,S,\varepsilon)n\log n$, space $F(d,k,S,\varepsilon)n$, and query time $F(d,k,S,\varepsilon)\log n + F(d,k,S,\varepsilon)^{-\log(\varepsilon)}$, where $F(d,k,S,\varepsilon)=O\left(2^{O(d)}k\Phi(S)\varepsilon^{-1}\right)^k$. The work extends to $c$-packed curves with improved bounds and develops a general tractably nearly-doubling framework to transfer results to spaces with unbounded doubling dimension via suitable subspaces and Gromov-Hausdorff proximity. It also establishes lower bounds demonstrating near-tightness and outlines a pathway to generalize these results to arbitrary metric spaces. Overall, the approach enables efficient approximate querying for curve data under Fréchet distance by reducing to a bounded-dimension setting and then applying established doubling-dimension ANN techniques, with implications for high-dimensional motion data and related geometric queries.

Abstract

We consider the $(1+\varepsilon)$-Approximate Nearest Neighbour (ANN) Problem for polygonal curves in $d$-dimensional space under the Fréchet distance and ask to what extent known data structures for doubling spaces can be applied to this problem. Initially, this approach does not seem viable, since the doubling dimension of the target space is known to be unbounded -- even for well-behaved polygonal curves of constant complexity in one dimension. In order to overcome this, we identify a subspace of curves which has bounded doubling dimension and small Gromov-Hausdorff distance to the target space. We then apply state-of-the-art techniques for doubling spaces and show how to obtain a data structure for the $(1+\varepsilon)$-ANN problem for any set of parametrized polygonal curves. The expected preprocessing time needed to construct the data-structure is $F(d,k,S,\varepsilon)n\log n$ and the space used is $F(d,k,S,\varepsilon)n$, with a query time of $F(d,k,S,\varepsilon)\log n + F(d,k,S,\varepsilon)^{-\log(\varepsilon)}$, where $F(d,k,S,\varepsilon)=O\left(2^{O(d)}k\Phi(S)\varepsilon^{-1}\right)^k$ and $\Phi(S)$ denotes the spread of the set of vertices and edges of the curves in $S$. We extend these results to the realistic class of $c$-packed curves and show improved bounds for small values of $c$.
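
The bounds above depend on the spread $\Phi(S)$. As a rough illustration only, the sketch below computes the classical spread of a finite point set, that is, the ratio of the largest to the smallest pairwise distance. This is an assumption made for illustration: the paper's spread is defined over the vertices and edges of the curves in $S$, and that definition is not reproduced here. The function name `point_spread` is hypothetical.

```python
from itertools import combinations
from math import dist


def point_spread(points):
    """Ratio of the largest to the smallest pairwise Euclidean distance.

    Illustrative stand-in only: the paper's spread also accounts for edges.
    """
    distances = [dist(a, b) for a, b in combinations(points, 2)]
    if not distances:
        raise ValueError("need at least two points")
    smallest = min(distances)
    if smallest == 0:
        raise ValueError("points must be pairwise distinct")
    return max(distances) / smallest


# Three collinear points: the closest pair is 1 apart, the farthest pair 4 apart.
print(point_spread([(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]))
```

A large spread means the input mixes very different length scales, which is why it enters the factor $F(d,k,S,\varepsilon)$.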

Paper Structure (16 sections, 29 theorems, 14 equations, 4 figures)

Key Result

Theorem 6

Given a set $S$ of $n$ polygonal curves in ${\mathbb X}^{d,k}_{\Lambda}$ and parameters $0<\varepsilon< 1$ and $\varepsilon'>0$, one can construct a data structure that for given $q\in{\mathbb X}^{d,k}$ outputs an element $s^*\in S$ such that for all $s\in S$ it holds that $\mathrm{d}_\mathcal{F}(s^

Figures (4)

  • Figure 1: Example of a curve $P\in{\mathbb X}^{d,k}_{\Lambda}$ in blue, and an $\varepsilon$-curve close to $P$ resulting from Lemma \ref{lem:simplification} in red.
  • Figure 2: Illustration of the set $L_{\lambda,\Delta}(P,s)$ in dark green, together with the points $p$ and $P(t)$ realizing a point $q$ in $L_{\lambda,\Delta}(P,s)$, that is $\|p-q\|=\lambda$ and $\mathrm{d}_\mathcal{F}(P[s,t],\overline{p\,q})\leq\Delta$.
  • Figure 4: Illustration of the construction from Lemma \ref{lem:lower} of two $(5,1)$-curves $G_{(2,6,10)}$ and $G_{(5,7,12)}$ in ${\mathbb X}^{1,9}$ that have Fréchet distance $1/2$ to the center curve $C$ at the top.
  • Figure 5: Example of a subset ${\mathbb Z}$ of the metric space ${\mathbb R}$ whose doubling dimension is larger than that of its ambient space. The disk centers are marked by circles.

Theorems & Definitions (43)

  • Definition 1: polygonal curve
  • Definition 2: Fréchet distance
  • Definition 3
  • Definition 4: discrete Fréchet distance
  • Theorem 6
  • Definition 7: bundledness
  • Definition 8: spread
  • Theorem 9
  • Corollary 9
  • Definition 10: doubling constant and dimension
  • ...and 33 more
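
The list above names the discrete Fréchet distance (Definition 4) without stating it. As a minimal sketch, the standard dynamic program for the discrete Fréchet distance between two vertex sequences is shown below. It is the textbook formulation, not code from the paper, and the name `discrete_frechet` is hypothetical. The continuous Fréchet distance of Definition 2 is never larger than the discrete one on the same vertex sequences, but can be smaller.

```python
from math import dist, inf


def discrete_frechet(p, q):
    """Discrete Fréchet distance between vertex sequences p and q."""
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("each curve needs at least one vertex")
    # dp[i][j] holds the discrete Fréchet distance of p[:i+1] and q[:j+1].
    dp = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                reach = 0.0
            else:
                # Cheapest way to arrive at (i, j): advance on p, on q, or on both.
                reach = min(
                    dp[i - 1][j] if i > 0 else inf,
                    dp[i][j - 1] if j > 0 else inf,
                    dp[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            dp[i][j] = max(reach, dist(p[i], q[j]))
    return dp[n - 1][m - 1]


# Two parallel horizontal curves at vertical distance 1.
lower = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
upper = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_frechet(lower, upper))
```

The table takes time and space proportional to the product of the two curve complexities, so this is a reference computation for a single pair of curves, not a substitute for the ANN data structure of Theorem 6.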