Concurrent aggregate queries

Gal Sela; Erez Petrank

TL;DR

The paper addresses efficient aggregate queries on concurrent trees. It formalizes aggregate metadata and additive and subtractive aggregate functions, which let a query be answered by root-to-leaf traversals instead of scanning the queried range. It introduces a design based on multi-versioning and operation announcements and presents two implementations: FastUpdateTree (optimized for update time) and FastQueryTree (optimized for aggregate-query time). Both are linearizable, and the paper analyzes their correctness and complexity. The work situates itself within the prior concurrent-query literature and offers a concrete design that uses per-thread metadata or serialized updates, exposing a trade-off between tree-update performance and query efficiency. It also discusses extensions, potential optimizations, and generalization to other tree families and more complex aggregate queries, with relevance to concurrent data-structure libraries and multi-core applications.
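To make the term "multi-versioning" concrete, here is a minimal single-threaded sketch of the general idea: every write is tagged with a version number, and a reader that fixed its version earlier sees the latest value written at or before that version. The class name `VersionedValue` and its methods are hypothetical and chosen for illustration; this is not the paper's VersionedField class, and it omits all synchronization that a concurrent implementation would need.

```python
import bisect


class VersionedValue:
    """Hypothetical illustration of a multi-versioned field (not thread-safe)."""

    def __init__(self, initial, version=0):
        # Parallel lists, kept sorted by version.
        self._versions = [version]
        self._values = [initial]

    def write(self, value, version):
        # Assumption: writes arrive with non-decreasing version numbers.
        if version < self._versions[-1]:
            raise ValueError("versions must be non-decreasing")
        self._versions.append(version)
        self._values.append(value)

    def read(self, version):
        # Latest value whose version is <= the reader's version.
        i = bisect.bisect_right(self._versions, version) - 1
        if i < 0:
            raise KeyError("no value at or before this version")
        return self._values[i]


field = VersionedValue(10, version=0)
field.write(15, version=3)
field.write(12, version=7)
print(field.read(6))  # 15: the write at version 7 is invisible to this reader
```

A query that picks its version once and reads every field at that version observes a consistent snapshot even while later writes continue, which is the property that makes versioning attractive for queries that touch several nodes.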

Abstract

Concurrent data structures serve as fundamental building blocks for concurrent computing. Many concurrent counterparts have been designed for basic sequential mechanisms; however, one notable omission is a concurrent tree that supports aggregate queries. Aggregate queries essentially compile succinct information about a range of data items, for example, calculating the average salary of employees in their 30s. Such queries play an essential role in various applications and are commonly taught in undergraduate data structures courses. In this paper, we formalize a type of aggregate query that can be efficiently supported by concurrent trees and present a design for implementing these queries on concurrent trees. We present two algorithms implementing this design, where one optimizes for tree update time, while the other optimizes for aggregate query time. We analyze their correctness and complexity, demonstrating the trade-offs between query time and update time.
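The salary example can be sketched in the sequential setting, as it is typically taught in data structures courses. The sketch below is an assumption-laden illustration, not the paper's concurrent algorithm: it uses an unbalanced binary search tree keyed by age, stores per-subtree metadata (a count and a salary sum), and answers a range query by subtracting two prefix aggregates, each computed in a single root-to-leaf walk. The names `Node`, `insert`, `prefix`, and `range_average` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int                      # e.g. employee age
    value: float                  # e.g. salary
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    count: int = 1                # number of items in this subtree
    total: float = 0.0            # sum of values in this subtree


def insert(node, key, value):
    """Unbalanced BST insert; equal keys go right. Updates metadata on the path."""
    if node is None:
        return Node(key, value, count=1, total=value)
    if key < node.key:
        node.left = insert(node.left, key, value)
    else:
        node.right = insert(node.right, key, value)
    node.count += 1
    node.total += value
    return node


def prefix(node, bound):
    """(count, total) over all keys < bound, using one root-to-leaf walk."""
    count, total = 0, 0.0
    while node is not None:
        if node.key < bound:
            # The whole left subtree and this node lie below the bound.
            left = node.left
            count += (left.count if left else 0) + 1
            total += (left.total if left else 0.0) + node.value
            node = node.right
        else:
            node = node.left
    return count, total


def range_average(root, low, high):
    """Average value over low <= key < high via two prefix aggregates."""
    c_hi, t_hi = prefix(root, high)
    c_lo, t_lo = prefix(root, low)
    n = c_hi - c_lo
    return (t_hi - t_lo) / n if n else None


root = None
for age, salary in [(35, 80000.0), (25, 50000.0), (42, 90000.0),
                    (31, 60000.0), (39, 70000.0), (30, 40000.0)]:
    root = insert(root, age, salary)

print(range_average(root, 30, 40))  # 62500.0
```

The subtraction step works because count and sum can be "undone": the aggregate of a range is recovered from the aggregates of two prefixes. The query cost is proportional to the tree height rather than to the number of items in the range; a balanced tree would be needed to bound that height.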

Paper Structure (33 sections, 1 theorem, 4 equations, 7 figures)

Introduction
Aggregate metadata and aggregate queries
Aggregate functions used for metadata
Aggregate queries
Terminology
The design
Design overview
The two algorithms
The base tree
Design backbone details
insert and delete operations
Aggregate queries
contains operation
isDeleted and getValueIfInserted auxiliary methods
Analysis
...and 18 more sections

Key Result

Lemma 7

The linearization points of contains operations and of failing insert and delete operations, as defined in the section on linearization points, are well defined. Namely, for each such operation $op$, a moment as described in the definition indeed occurs within $op$'s interval.

Figures (7)

Figure 1: Template for a basic aggregate query
Figure 2: Fields of the node classes
Figure 3: Fields of the Update class
Figure 4: The VersionedField class
Figure 5: Pseudocode for the isDeleted method
...and 2 more figures

Theorems & Definitions (6)

Definition 1: additive aggregate function
Definition 2: additive aggregate function -- alternative
Definition 4: subtractive aggregate function
Definition 5: subtractive aggregate function -- alternative
Lemma 7
Claim 8
