Succinct Data Structures for Segments
Philip Bille, Inge Li Gørtz, Simon R. Tarnow
TL;DR
This work presents a succinct representation for a set of $n$ horizontal line segments in rank space that supports segment-access, segment-select, and segment-rank queries in $2n\lg n + O(n\lg n / \lg\lg n)$ bits of space and $O(\lg n / \lg\lg n)$ query time. The core technique is a segment wavelet tree, which extends the classic wavelet tree for sequences; its $\Delta$-ary slab generalization reduces the query time from $O(\lg n)$ to $O(\lg n / \lg\lg n)$ while keeping the encoding succinct. The authors also establish matching lower bounds: representing the segments requires at least $2n\lg n - O(n)$ bits by a counting argument, and segment-rank queries require $\Omega(\lg n / \lg\lg n)$ time under common static-model assumptions, via a reduction from 2-dimensional counting. The problem is a central component of compressed data structures for persistent strings supporting random access, and the segment wavelet tree is of independent interest.
Abstract
We consider succinct data structures for representing a set of $n$ horizontal line segments in the plane given in rank space to support \emph{segment access}, \emph{segment selection}, and \emph{segment rank} queries. A segment access query finds the segment $(x_1, x_2, y)$ given its $y$-coordinate (the $y$-coordinates of the segments are distinct), a segment selection query finds the $j$th smallest segment (the segment with the $j$th smallest $y$-coordinate) among the segments crossing the vertical line through a given $x$-coordinate, and a segment rank query finds the number of segments crossing the vertical line through $x$-coordinate $x$ with $y$-coordinate at most $y$, for a given $x$ and $y$. This problem is a central component in compressed data structures for persistent strings supporting random access. Our main result is a data structure using $2n\lg{n} + O(n\lg{n}/\lg{\lg{n}})$ bits of space and $O(\lg{n}/\lg{\lg{n}})$ query time for all operations. We show that this space bound is optimal up to lower-order terms. We also show that the query time for segment rank is optimal. The query time for segment selection is also optimal by a previous bound. To obtain our results, we present a novel segment wavelet tree data structure of independent interest. This structure is inspired by and extends the classic wavelet tree for sequences. This leads to a simple, succinct solution with $O(\lg{n})$ query times. We then extend this solution to obtain optimal query time. Our space lower bound follows from a simple counting argument, and our lower bound for segment rank is obtained by a reduction from 2-dimensional counting.
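To make the three query definitions concrete, the following is a minimal brute-force sketch in Python. It is not the segment wavelet tree from the paper and has none of its space or time guarantees; it only pins down what each query returns. The class name `NaiveSegments`, the method names, the 1-based index `j`, and the convention that a segment $(x_1, x_2, y)$ crosses the vertical line at $x$ when $x_1 \le x \le x_2$ are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A segment is (x1, x2, y) with x1 <= x2; y-coordinates are distinct.
Segment = Tuple[int, int, int]


class NaiveSegments:
    """Brute-force reference for the three queries (hypothetical helper,
    not the paper's data structure). Each query takes O(n log n) time or less."""

    def __init__(self, segments: List[Segment]) -> None:
        # Index segments by their (distinct) y-coordinate.
        self.by_y: Dict[int, Segment] = {y: (x1, x2, y) for x1, x2, y in segments}
        if len(self.by_y) != len(segments):
            raise ValueError("y-coordinates must be distinct")

    def _crossing(self, x: int) -> List[Segment]:
        # Assumption: endpoints are inclusive, so x1 <= x <= x2 counts as crossing.
        hits = [s for s in self.by_y.values() if s[0] <= x <= s[1]]
        return sorted(hits, key=lambda s: s[2])

    def access(self, y: int) -> Segment:
        """Segment access: the segment with y-coordinate y."""
        return self.by_y[y]

    def select(self, x: int, j: int) -> Segment:
        """Segment selection: the j-th smallest (1-based, by y) segment
        among those crossing the vertical line at x."""
        return self._crossing(x)[j - 1]

    def rank(self, x: int, y: int) -> int:
        """Segment rank: number of segments crossing the vertical line at x
        whose y-coordinate is at most y."""
        return sum(1 for s in self._crossing(x) if s[2] <= y)


# Three segments in rank space: endpoints use x-values 1..6, y-values 1..3.
ds = NaiveSegments([(1, 4, 1), (2, 6, 2), (3, 5, 3)])
print(ds.access(2))     # the segment stored at y = 2
print(ds.select(5, 1))  # lowest segment crossing x = 5
print(ds.rank(5, 2))    # segments crossing x = 5 with y <= 2
```

In this example, the segments crossing $x = 5$ are $(2, 6, 2)$ and $(3, 5, 3)$, so selecting the first of them returns $(2, 6, 2)$ and the rank for $y = 2$ is 1. A succinct structure answers the same queries without storing the segments explicitly in this form.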
