On strictly output sensitive color frequency reporting

Erwin Glazenburg, Frank Staals

Abstract

Given a set of $n$ colored points $P \subset \mathbb{R}^d$, we wish to store $P$ such that, given some query region $Q$, we can efficiently report the colors of the points appearing in the query region, along with their frequencies. This is the \emph{color frequency reporting} problem. We study the case where query regions $Q$ are axis-aligned boxes or dominance ranges. If $Q$ contains $k$ colors, the main goal is to achieve ``strictly output sensitive'' query time $O(f(n) + k)$. Firstly, we show that, for every $s \in \{ 2, \dots, n \}$, there exists a simple $O(ns\log_s n)$ size data structure for points in $\mathbb{R}^2$ that allows frequency reporting queries in $O(\log n + k\log_s n)$ time. Secondly, we give a lower bound for the weighted version of the problem in the arithmetic model of computation, proving that with $O(m)$ space one cannot achieve query times better than $\Omega\left(\phi\frac{\log (n / \phi)}{\log (m / n)}\right)$, where $\phi$ is the number of possible colors. This means that our data structure is near-optimal. We extend these results to higher dimensions as well. Thirdly, we present a transformation that allows us to reduce the space usage of the aforementioned data structure to $O(n(s \phi)^\varepsilon \log_s n)$. Finally, we give an $O(n^{1+\varepsilon} + m \log n + K)$-time algorithm that can answer $m$ dominance queries in $\mathbb{R}^2$ with total output complexity $K$, while using only linear working space.
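
To make the query semantics concrete, the following is a minimal brute-force sketch of color frequency reporting for an axis-aligned box. It is not one of the data structures from the paper: it scans all $n$ points per query and serves only as a reference definition of the expected output. The function name `color_frequencies` and the input layout (points as `(coords, color)` pairs, a box as one `(lo, hi)` interval per dimension) are assumptions made for illustration.

```python
from collections import Counter


def color_frequencies(points, box):
    """Report {color: count} for the points inside an axis-aligned box.

    points: iterable of (coords, color), coords being a tuple of d numbers.
    box:    sequence of d closed intervals (lo, hi), one per dimension.
    Runs in O(n * d) time per query; a reference implementation only.
    """
    counts = Counter()
    for coords, color in points:
        # A point lies in the box if every coordinate is within its interval.
        if all(lo <= x <= hi for x, (lo, hi) in zip(coords, box)):
            counts[color] += 1
    # Colors with no point in the box never enter the counter,
    # so they are absent from the output.
    return dict(counts)
```

A dominance range can be expressed in this sketch by setting every lower bound to `float("-inf")`.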

Paper Structure (13 sections, 8 theorems, 4 figures)

Key Result

Lemma 1

Let $P$ be a set of $n$ points in $\mathbb{R}^1$ with colors from $\{1, \dots, \phi\}$. In $O(n\log n)$ time we can build an $O(n)$ space data structure, so that we can answer dominance color frequency reporting queries using a binary search and $O(k)$ additional time.
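
The one-dimensional setting of the lemma can be illustrated with a simplified sketch. This is not the structure from the lemma: it uses $O(n)$ space and one binary search to locate the query, but then spends an extra binary search per reported color, giving $O(\log n + k \log n)$ query time rather than the lemma's $O(k)$ additional time. It assumes a dominance query in $\mathbb{R}^1$ is a range $(-\infty, q]$; the class name `PrefixColorCounter` and its methods are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


class PrefixColorCounter:
    """Simplified 1D dominance color frequency reporting (illustrative only)."""

    def __init__(self, points):
        # points: iterable of (coordinate, color). Sorting costs O(n log n).
        pts = sorted(points, key=lambda p: p[0])
        self.xs = [x for x, _ in pts]
        positions = defaultdict(list)
        for i, (_, color) in enumerate(pts):
            positions[color].append(i)  # sorted indices of this color
        # Dicts keep insertion order, so colors are ordered by first occurrence.
        self.colors = list(positions.items())

    def query(self, q):
        """Return [(color, count)] for all points with coordinate <= q."""
        r = bisect_right(self.xs, q)  # number of points in (-inf, q]
        out = []
        for color, idxs in self.colors:
            if idxs[0] >= r:
                # Colors are ordered by first occurrence, so no later color
                # has a point in the range either.
                break
            # Count this color's indices that are at most r - 1.
            out.append((color, bisect_right(idxs, r - 1)))
        return out
```

Because colors are visited in order of first occurrence, the loop touches only the $k$ colors present in the range plus at most one more, so the work stays output sensitive up to the logarithmic factor per color.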

Figures (4)

  • Figure 1: A set of colored points $P$ in $\mathbb{R}^2$ and a query box $Q$ with color frequencies (red, 4) and (blue, 3). There are no green or purple points in $Q$, hence they do not appear in the output.
  • Figure 2: A set of points and three dominance queries. In the offline version, we would want to answer all three queries at once.
  • Figure 3: From left to right: $2$-, $3$-, and $4$-sided boxes in $\mathbb{R}^2$, and a $5$-sided box in $\mathbb{R}^3$.
  • Figure 6: A 3-sided query $q$ at node $u$, with $x_u$ shown.

Theorems & Definitions (9)

  • Lemma 1
  • Lemma 2
  • Theorem 3
  • Lemma 3
  • Remark 4
  • Theorem 5
  • Lemma 6: Chazelle's uncolored lower bound (chazelle1990ArithmeticLowerBound, Theorem 2.1)
  • Lemma 7
  • Lemma 8