Enhancing Computation Pushdown for Cloud OLAP Databases
Yifei Yang, Xiangyao Yu, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker
TL;DR
This work tackles the network bottleneck in storage-disaggregated cloud OLAP systems by proposing Adaptive pushdown, a runtime mechanism in which a pushback arbiter decides whether each pushdown task executes in the storage layer or is pushed back to the computation layer. It also derives a general principle for identifying pushdown-amenable operators and introduces two new pushdown operators, selection bitmap and distributed data shuffle. On TPC-H, Adaptive pushdown achieves up to 1.9x speedup over the No pushdown and Eager pushdown baselines, and the new operators further accelerate query execution by up to 3.0x. The work offers guidance for dynamic pushdown decisions and operator design in multi-tenant storage-disaggregated architectures, with FPDB as a concrete open-source platform.
Abstract
The network is a major bottleneck in modern cloud databases that adopt a storage-disaggregation architecture. Computation pushdown is a promising solution: it offloads some computation tasks to the storage layer to reduce network traffic. Existing cloud OLAP systems decide statically, during query optimization, whether to push down computation, and they do not consider the storage layer's computational capacity and load. In addition, there is no general principle for determining which operators are amenable to pushdown. Existing systems design and implement pushdown features empirically, so each ends up supporting only a limited set of pushdown operators. In this paper, we first design Adaptive pushdown, a new mechanism that avoids throttling storage-layer computation during pushdown: if the storage layer's computational resources are insufficient, the request is pushed back to the computation layer at runtime. Moreover, we derive a general principle for identifying pushdown-amenable computational tasks by summarizing common patterns of pushdown capabilities in existing systems. Based on this principle, we propose two new pushdown operators, namely selection bitmap and distributed data shuffle. Evaluation results on TPC-H show that Adaptive pushdown achieves up to 1.9x speedup over both the No pushdown and Eager pushdown baselines, and that the new pushdown operators further accelerate query execution by up to 3.0x.
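To make the runtime pushback idea concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a hypothetical storage node that admits a pushdown task only while it has a free execution slot. The names `StorageNode`, `try_admit`, and `run_scan`, as well as the slot-based admission rule, are illustrative assumptions and do not come from FPDB.

```python
from dataclasses import dataclass


@dataclass
class StorageNode:
    """Hypothetical storage node with a fixed number of pushdown slots."""
    slots: int        # assumed capacity: concurrent pushdown tasks accepted
    in_use: int = 0   # slots currently occupied

    def try_admit(self) -> bool:
        """Admit a pushdown task if a slot is free; otherwise signal pushback."""
        if self.in_use < self.slots:
            self.in_use += 1
            return True
        return False

    def release(self) -> None:
        """Free the slot held by a finished pushdown task."""
        self.in_use = max(0, self.in_use - 1)


def filter_rows(rows, predicate):
    """The pushed-down operator in this sketch: a simple selection."""
    return [row for row in rows if predicate(row)]


def run_scan(node, rows, predicate):
    """Return (where_executed, result, rows_sent_over_network)."""
    if node.try_admit():
        # Pushdown accepted: filter at storage, ship only matching rows.
        try:
            result = filter_rows(rows, predicate)
        finally:
            node.release()
        return "storage", result, len(result)
    # Pushback: storage is saturated, so ship raw rows and filter at compute.
    result = filter_rows(rows, predicate)
    return "compute", result, len(rows)


rows = list(range(10))
is_even = lambda x: x % 2 == 0
idle = StorageNode(slots=1)             # one free slot
busy = StorageNode(slots=1, in_use=1)   # no free slot

print(run_scan(idle, rows, is_even))    # executed at storage
print(run_scan(busy, rows, is_even))    # pushed back to compute
```

In this sketch both paths return the same query result. Only the execution site and the number of rows crossing the network differ, which is the trade-off a runtime decision has to weigh against storage-layer load.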
