Work-Efficient Query Evaluation with PRAMs

Jens Keppeler, Thomas Schwentick, Christopher Spinrath

TL;DR

This work investigates work-efficient constant-time parallel query evaluation on CRCW PRAMs, addressing the fundamental challenge that although relational algebra can be evaluated in constant time, output representation and deduplication can incur large work costs. The authors develop a framework combining approximate prefix sums, approximate compaction, and padded sorting to enable weakly work-efficient constant-time algorithms across several settings. They prove strong results for dictionary-based representations: acyclic join queries and free-connex acyclic joins achieve near worst-case-optimal work bounds, while semijoin algebra queries admit work-optimal evaluation, and joins can reach worst-case-lean bounds in a constant-time regime. They further extend the results to ordered and general settings via reductions to the dictionary setting, and outline several open questions, including dynamic maintenance and potential PANDA-driven improvements, highlighting the practical significance of work-efficient constant-time parallel query processing.

Abstract

The article studies query evaluation in parallel constant time in the CRCW PRAM model. While it is well known that all relational algebra queries can be evaluated in constant time on an appropriate CRCW PRAM model, this article is interested in the efficiency of evaluation algorithms, that is, in the number of processors or, asymptotically equivalently, in the work. Naive evaluation in the parallel setting results in huge (polynomial) bounds on the work of such algorithms and in presentations of the result sets that can be extremely scattered in memory. The article discusses some obstacles for constant-time PRAM query evaluation. It presents algorithms for relational operators and explores three settings in which efficient sequential query evaluation algorithms exist: acyclic queries, semijoin algebra queries, and join queries -- the latter in the worst-case optimal framework. Under mild assumptions -- that data values are numbers of polynomial size in the size of the database or that the relations of the database are suitably sorted -- constant-time algorithms are presented that are weakly work-efficient in the sense that work $\mathcal{O}(T^{1+\varepsilon})$ can be achieved, for every $\varepsilon>0$, compared to the time $T$ of an optimal sequential algorithm. Important tools are the algorithms for approximate prefix sums and compaction from Goldberg and Zwick (1995).
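
To illustrate why prefix sums and compaction matter for result sets that are "scattered in memory", the following is a minimal sketch of compaction via prefix sums. It is a sequential simulation with exact prefix sums, not the constant-time approximate algorithms of Goldberg and Zwick (1995) that the article relies on; the function name `compact` and the use of `None` for empty cells are assumptions made for this example only.

```python
from itertools import accumulate


def compact(cells):
    """Move the non-empty cells of `cells` to the front, preserving order.

    `None` marks an empty cell (an assumption of this sketch). Each step
    below corresponds to a phase in which, on a PRAM, every index i would
    be handled by its own processor.
    """
    # Step 1: flags[i] = 1 if cell i holds a tuple, else 0.
    flags = [0 if c is None else 1 for c in cells]
    # Step 2: inclusive prefix sums, so prefix[i] counts the
    # non-empty cells among cells[0..i].
    prefix = list(accumulate(flags))
    # Step 3: every non-empty cell writes itself to slot prefix[i] - 1.
    total = prefix[-1] if prefix else 0
    out = [None] * total
    for i, c in enumerate(cells):
        if c is not None:
            out[prefix[i] - 1] = c
    return out


scattered = [None, ("a", 1), None, None, ("b", 2), ("c", 3), None]
print(compact(scattered))  # prints [('a', 1), ('b', 2), ('c', 3)]
```

Steps 1 and 3 are independent per index, so only the prefix sums in step 2 stand between this scheme and constant parallel time. That is the step for which the abstract names approximate prefix sums as an important tool.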

Paper Structure

This paper contains 28 sections, 47 theorems, 30 equations, 1 figure, 4 tables.

Key Result

Proposition 2.1

Any algorithm that computes the parity function of $n$ variables and uses a polynomial number of processors requires $\Omega\left(\frac{\log{n}}{\log{\log{n}}}\right)$ time on a Priority CRCW PRAM.

Theorems & Definitions (80)

  • Example 1.1
  • Proposition 2.1: DBLP:books/aw/JaJa92
  • Corollary 2.2
  • Proposition 2.3: DBLP:journals/iandc/Chaudhuri96
  • Corollary 2.4
  • proof
  • Proposition 2.5: GoldbergZ95
  • Lemma 2.6
  • proof
  • ...and 70 more