$(1+\varepsilon)$-ANN Data Structure for Curves via Subspaces of Bounded Doubling Dimension
Jacobus Conradi, Anne Driemel, Benedikt Kolbe
TL;DR
The paper studies the $(1+\varepsilon)$-ANN problem for polygonal curves under the Fréchet distance, where the main obstacle is that the full curve space has unbounded doubling dimension. It identifies a closely related subspace ${\mathbb X}^*$ with bounded doubling dimension and uses it to port ANN techniques from spaces of low doubling dimension. The resulting data structure has expected preprocessing time $F(d,k,S,\varepsilon)n\log n$, space $F(d,k,S,\varepsilon)n$, and query time $F(d,k,S,\varepsilon)\log n + F(d,k,S,\varepsilon)^{-\log(\varepsilon)}$, where $F(d,k,S,\varepsilon)=O\left(2^{O(d)}k\Phi(S)\varepsilon^{-1}\right)^k$ and $\Phi(S)$ is the spread of the set of vertices and edges of the curves in $S$. The work extends to $c$-packed curves, with improved bounds for small values of $c$, and develops a general framework of tractably nearly-doubling spaces that transfers results to spaces of unbounded doubling dimension via suitable subspaces at small Gromov-Hausdorff distance. It also establishes lower bounds indicating that the results are nearly tight, and outlines how they may generalize to arbitrary metric spaces.
Abstract
We consider the $(1+\varepsilon)$-Approximate Nearest Neighbour (ANN) Problem for polygonal curves in $d$-dimensional space under the Fréchet distance and ask to what extent known data structures for doubling spaces can be applied to this problem. Initially, this approach does not seem viable, since the doubling dimension of the target space is known to be unbounded, even for well-behaved polygonal curves of constant complexity in one dimension. To overcome this, we identify a subspace of curves which has bounded doubling dimension and small Gromov-Hausdorff distance to the target space. We then apply state-of-the-art techniques for doubling spaces and show how to obtain a data structure for the $(1+\varepsilon)$-ANN problem for any set of parametrized polygonal curves. The expected preprocessing time needed to construct the data structure is $F(d,k,S,\varepsilon)n\log n$ and the space used is $F(d,k,S,\varepsilon)n$, with a query time of $F(d,k,S,\varepsilon)\log n + F(d,k,S,\varepsilon)^{-\log(\varepsilon)}$, where $F(d,k,S,\varepsilon)=O\left(2^{O(d)}k\Phi(S)\varepsilon^{-1}\right)^k$ and $\Phi(S)$ denotes the spread of the set of vertices and edges of the curves in $S$. We extend these results to the realistic class of $c$-packed curves and show improved bounds for small values of $c$.
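
To make the problem statement concrete, the following minimal sketch spells out what a valid $(1+\varepsilon)$-ANN answer is, using a brute-force linear scan as the reference. It is not the data structure of the paper. As a simplifying assumption it uses the discrete Fréchet distance on the curve vertices as a stand-in for the continuous Fréchet distance considered above, and the names `discrete_frechet` and `is_valid_ann_answer` are hypothetical, introduced only for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two vertex sequences.

    Assumption: this is a stand-in for the continuous Fréchet distance;
    it only considers alignments of vertices, not of points on edges.
    """
    n, m = len(p), len(q)
    # table[i][j] = discrete Fréchet distance between p[:i+1] and q[:j+1]
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                # Cheapest way to reach (i, j), then pay for the current pair.
                reach = min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])
                table[i][j] = max(reach, d)
    return table[-1][-1]


def is_valid_ann_answer(curves, query, answer, eps):
    """Check the (1+eps)-ANN guarantee by brute force over all input curves."""
    best = min(discrete_frechet(c, query) for c in curves)
    return discrete_frechet(answer, query) <= (1 + eps) * best


# Three horizontal segments at heights 0, 1 and 5; the query sits at height 0.4.
curves = [[(0, 0), (1, 0)], [(0, 1), (1, 1)], [(0, 5), (1, 5)]]
query = [(0, 0.4), (1, 0.4)]
```

With these inputs the exact nearest neighbour is `curves[0]`. For a large tolerance such as `eps = 1.0` the second curve is also an acceptable answer, while for `eps = 0.1` only the first curve is. The linear scan takes time proportional to $n$ per query, which is the baseline the logarithmic dependence on $n$ in the stated query time improves upon.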
