Succinct Data Structures for Segments

Philip Bille, Inge Li Gørtz, Simon R. Tarnow

TL;DR

This work presents a succinct representation for a set of $n$ horizontal line segments in rank space that supports segment-access, segment-select, and segment-rank queries using $2n\lg n + O(n\lg n / \lg\lg n)$ bits of space and $O(\lg n / \lg\lg n)$ query time. The core technique is a segment wavelet tree and its $\Delta$-ary slab generalization, which together give both a concise encoding and sublogarithmic query times. The authors establish matching lower bounds: representing the segments requires at least $2n\lg n - O(n)$ bits, and segment-rank requires $\Omega(\lg n / \lg\lg n)$ time under common static-model assumptions, via a reduction from 2D counting. The results advance succinct data structures for geometric queries and have implications for compressed representations of persistent strings; the segment wavelet tree is also of independent interest.

Abstract

We consider succinct data structures for representing a set of $n$ horizontal line segments in the plane given in rank space to support \emph{segment access}, \emph{segment selection}, and \emph{segment rank} queries. A segment access query finds the segment $(x_1, x_2, y)$ given its $y$-coordinate ($y$-coordinates of the segments are distinct), a segment selection query finds the $j$th smallest segment (the segment with the $j$th smallest $y$-coordinate) among the segments crossing the vertical line for a given $x$-coordinate, and a segment rank query finds the number of segments that cross the vertical line through a given $x$-coordinate $x$ and have $y$-coordinate at most a given $y$. This problem is a central component in compressed data structures for persistent strings supporting random access. Our main result is a data structure using $2n\lg{n} + O(n\lg{n}/\lg{\lg{n}})$ bits of space and $O(\lg{n}/\lg{\lg{n}})$ query time for all operations. We show that this space bound is optimal up to lower-order terms. We also show that the query time for segment rank is optimal. The query time for segment selection is also optimal by a previous bound. To obtain our results, we present a novel segment wavelet tree data structure of independent interest. This structure is inspired by and extends the classic wavelet tree for sequences. This leads to a simple, succinct solution with $O(\log n)$ query times. We then extend this solution to obtain optimal query time. Our space lower bound follows from a simple counting argument, and our lower bound for segment rank is obtained by a reduction from 2-dimensional counting.
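
To make the three query definitions concrete, here is a minimal brute-force reference in Python. It is not the paper's segment wavelet tree and makes no attempt at succinct space or fast queries; it only pins down what each query returns. The function names, the argument order `(x, j)` and `(x, y)`, 1-based `j`, and the convention that a segment crosses the line at `x` when `x1 <= x <= x2` are all assumptions made for illustration.

```python
from typing import List, Tuple

# A segment is (x1, x2, y) with x1 <= x2; y-coordinates are distinct.
Segment = Tuple[int, int, int]


def segment_access(segments: List[Segment], y: int) -> Segment:
    """Return the unique segment whose y-coordinate is y."""
    for seg in segments:
        if seg[2] == y:
            return seg
    raise KeyError(f"no segment with y-coordinate {y}")


def _crossing(segments: List[Segment], x: int) -> List[Segment]:
    """Segments crossing the vertical line at x, sorted by y.

    Assumption: endpoints are inclusive, i.e. x1 <= x <= x2 counts as crossing.
    """
    return sorted((s for s in segments if s[0] <= x <= s[1]), key=lambda s: s[2])


def segment_select(segments: List[Segment], x: int, j: int) -> Segment:
    """Return the segment with the j-th smallest y among those crossing x (1-based j)."""
    crossing = _crossing(segments, x)
    if not 1 <= j <= len(crossing):
        raise IndexError(f"only {len(crossing)} segments cross x={x}")
    return crossing[j - 1]


def segment_rank(segments: List[Segment], x: int, y: int) -> int:
    """Count the segments crossing x whose y-coordinate is at most y."""
    return sum(1 for s in _crossing(segments, x) if s[2] <= y)


if __name__ == "__main__":
    # Hypothetical input: three segments with distinct endpoints and y-values.
    segs = [(1, 4, 1), (2, 6, 2), (3, 5, 3)]
    print(segment_access(segs, 2))     # the segment stored at y = 2
    print(segment_select(segs, 5, 1))  # lowest segment crossing x = 5
    print(segment_rank(segs, 3, 2))    # segments crossing x = 3 with y <= 2
```

Each query here scans all $n$ segments, so it serves only as a correctness baseline against which a succinct structure could be checked.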

Paper Structure

This paper contains 21 sections, 9 theorems, 11 equations, 1 figure.

Key Result

Theorem 1

Given a set of $n$ horizontal line segments, we can solve the segment representation problem using $2n\lg{n} + O(n\lg{n}/\lg{\lg{n}})$ bits of space and $O(\lg{n}/\lg{\lg{n}})$ time for all queries.
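
The leading term $2n\lg n$ in Theorem 1 can be compared against the counting-based space lower bound mentioned in the abstract. The following is a sketch of such a count, not the paper's proof. It assumes (this summary does not state it) that rank space means the $2n$ endpoint $x$-coordinates are distinct values in $\{1,\dots,2n\}$ and the $y$-coordinates are distinct values in $\{1,\dots,n\}$. Under that assumption, an input is a partition of the $2n$ $x$-values into $n$ unordered pairs, one per $y$-value, and Stirling's approximation gives the bound.

```latex
% Assumption: 2n distinct endpoint x-coordinates in {1..2n}, one segment per y in {1..n}.
% Ordering the 2n x-values and pairing consecutive positions gives (2n)! arrangements;
% the order within each of the n pairs is irrelevant, hence the factor 2^n.
\begin{align*}
\#\text{inputs} &= \frac{(2n)!}{2^{n}}, \\
\lg \frac{(2n)!}{2^{n}}
  &= 2n\lg(2n) - 2n\lg e + O(\lg n) - n \\
  &= 2n\lg n - (2\lg e - 1)\,n + O(\lg n) \\
  &= 2n\lg n - O(n).
\end{align*}
```

Any representation must distinguish all inputs, so it needs at least this many bits, which agrees with the $2n\lg n - O(n)$ bound quoted in the TL;DR.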

Figures (1)

  • Figure 1: The top 3 levels of the segment wavelet tree of the segments $\mathcal{L}$ and the computed local variables for the query $\textsf{segment-select}(7,2)$, where $v$ is the root of the segment wavelet tree. For some of the nodes, the corresponding subproblem is visualized as a 2D plane, where empty columns have been removed. The bitvectors $B^L$ and $B^R$ of each node are horizontally spaced such that each bit vertically aligns with the endpoint it represents. The nodes visited by the query $\textsf{segment-select}(7,2)$ are marked with a red arrow together with the local variables. Furthermore, in the 2D plane of each visited node, the vertical line with $x$-coordinate $7$ is highlighted, and the prefixes of the bitvectors $B^L$ and $B^R$ that correspond to the endpoints with $x$-coordinate at most $7$ are also highlighted.

Theorems & Definitions (10)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Lemma 1: DBLP:conf/focs/Jacobson89
  • Lemma 2
  • Lemma 3: DBLP:conf/esa/GolynskiGGRR07
  • Theorem 4
  • Lemma 4: DBLP:journals/mst/BilleG23
  • proof
  • Lemma 5: DBLP:conf/stoc/Patrascu07