Efficient Algorithms for Top-k Stabbing Queries on Weighted Interval Data (Full Version)
Daichi Amagata, Junya Yamada, Yuchen Ji, Takahiro Hara
TL;DR
This paper addresses top-k weighted stabbing queries on static weighted intervals. It introduces two exact algorithms: Interval Forest, which achieves $O(\sqrt{n}\log n + k)$ query time with $O(n)$ space, and a segment-tree variant, ST-PSA, which achieves $O(\log n + k)$ query time with $O(n\log n\log\log n)$ preprocessing and $O(n\log^2 n)$ space. The authors prove these bounds and validate them through experiments on two large real datasets, where both algorithms run faster than the prior state of the art. The results are relevant to large-scale interval data in domains such as finance and transportation, and they point to future work on dynamic intervals and continuous top-k queries.
Abstract
Intervals are generated in many applications (e.g., temporal databases), and they are often associated with weights, such as prices. This paper addresses the problem of processing top-k weighted stabbing queries on interval data. Given a set of weighted intervals, a query value, and a result size $k$, the problem is to find the $k$ intervals that are stabbed by (i.e., contain) the query value and have the largest weights. Although this problem has practical applications (e.g., purchase, vehicle, and cryptocurrency analysis), it has not been well studied. A state-of-the-art algorithm for this problem incurs $O(n\log k)$ time, where $n$ is the number of intervals, so it does not scale to large $n$. We address this inefficiency and propose an algorithm that runs in $O(\sqrt{n}\log n + k)$ time. We also propose an $O(\log n + k)$ algorithm that further reduces the query time. Experiments on two large real datasets demonstrate that our algorithms are faster than existing algorithms.
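To make the problem statement concrete, the following is a minimal sketch of a top-k weighted stabbing query answered by a plain linear scan with a size-$k$ heap. It is not one of the algorithms proposed in this paper, and it is not claimed to be the prior state-of-the-art algorithm; it only illustrates the query semantics and an $O(n\log k)$-time way of computing the answer. The function name `topk_stabbing_scan`, the `(left, right, weight)` tuple layout, and the use of closed intervals are assumptions made for illustration.

```python
import heapq
from typing import List, Tuple

# Hypothetical representation: (left, right, weight), endpoints assumed inclusive.
Interval = Tuple[float, float, float]


def topk_stabbing_scan(intervals: List[Interval], q: float, k: int) -> List[Interval]:
    """Return up to k intervals containing q, ordered by decreasing weight.

    Illustrative linear scan: every interval is inspected once, and
    heapq.nlargest keeps only the k best candidates seen so far.
    """
    # Keep only the intervals stabbed by the query value q.
    stabbed = (iv for iv in intervals if iv[0] <= q <= iv[1])
    # Select the k stabbed intervals with the largest weights.
    return heapq.nlargest(k, stabbed, key=lambda iv: iv[2])


intervals = [(1, 5, 10.0), (2, 8, 7.5), (6, 9, 3.0), (4, 6, 12.0)]
result = topk_stabbing_scan(intervals, q=4.5, k=2)
print(result)  # [(4, 6, 12.0), (1, 5, 10.0)]
```

Here the query value 4.5 stabs three of the four intervals, and the two with the largest weights are returned. Because the scan touches all $n$ intervals for every query, its cost grows linearly with the dataset size, which is the inefficiency the proposed index-based algorithms are designed to avoid.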
