The AHA-Tree: An Adaptive Index for HTAP Workloads
Lu Xing, Walid G. Aref
TL;DR
The paper addresses HTAP workloads that oscillate between write-heavy transactional phases and read-heavy analytical phases, where static indexes either underperform or require downtime for reorganization. It introduces the AHA-tree, an adaptive index that combines an LSM-tree-style buffering layer with a B+-tree-like leaf region, so that it can morph between write-optimized and read-optimized modes while preserving concurrency and avoiding downtime. The key contributions are a concrete architectural design with LSM buffering at the root and leaf levels; a hotspot-aware adaptation mechanism that flushes hotspot data toward leaf pages; and a set of knobs (e.g., lazy vs. eager adaptation, batched vs. single inserts, leaf transformation strategies) for exploring trade-offs. The demonstrations contrast throughput against baseline indexes and illustrate the intermediate adaptive states. The work shows that the AHA-tree can maintain competitive throughput under dynamic workloads and offers practical mechanisms to minimize downtime and data migration in HTAP environments.
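The hotspot-aware adaptation summarized above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the AHA-tree's actual implementation: the class `ToyAdaptiveIndex`, its methods, and the `adapt` flag are hypothetical names; a plain dictionary stands in for the LSM-style buffer and a sorted list stands in for the B+-tree-like leaf region. Concurrency control, node structure, and the paper's other knobs are omitted.

```python
import bisect


class ToyAdaptiveIndex:
    """Toy model: a write buffer (LSM-like) over a sorted leaf array (B+-tree-like)."""

    def __init__(self):
        self.buffer = {}     # recent writes, cheap to insert into (write-optimized side)
        self.leaf_keys = []  # sorted keys in the leaf region (read-optimized side)
        self.leaf_vals = {}  # values for the keys held in the leaf region

    def insert(self, key, value):
        # Writes always land in the buffer; nothing is sorted at insert time.
        self.buffer[key] = value

    def flush_range(self, lo, hi):
        """Move buffered entries with lo <= key <= hi into the sorted leaf region."""
        hot = [k for k in self.buffer if lo <= k <= hi]
        for k in hot:
            if k not in self.leaf_vals:
                bisect.insort(self.leaf_keys, k)
            self.leaf_vals[k] = self.buffer.pop(k)
        return len(hot)

    def range_query(self, lo, hi, adapt=True):
        # With adapt=True, the queried range is treated as a hotspot and flushed first
        # (an assumed, simplified stand-in for an eager adaptation policy).
        if adapt:
            self.flush_range(lo, hi)
        i = bisect.bisect_left(self.leaf_keys, lo)
        j = bisect.bisect_right(self.leaf_keys, hi)
        result = {k: self.leaf_vals[k] for k in self.leaf_keys[i:j]}
        # Entries still in the buffer are newer and override leaf values.
        for k, v in self.buffer.items():
            if lo <= k <= hi:
                result[k] = v
        return sorted(result.items())


idx = ToyAdaptiveIndex()
for k in [5, 1, 9, 3, 7]:
    idx.insert(k, str(k))
hot_result = idx.range_query(3, 7)  # flushes keys 3, 5, 7 to the leaf region
```

After the query, only the queried range has been sorted into the leaf region, while keys 1 and 9 remain buffered. The index is therefore in an intermediate state: read-optimized where the analytical query touched it and write-optimized elsewhere. Passing `adapt=False` answers the query by merging buffer and leaves without moving any data.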
Abstract
In this demo, we realize data indexes that can morph from being write-optimized at times to being read-optimized at other times, nonstop and with zero downtime during workload transitions. These data indexes are useful for HTAP systems (Hybrid Transactional and Analytical Processing systems), where transactional workloads are write-heavy while analytical workloads are read-heavy. Traditional indexes, e.g., the B+-tree and the LSM-tree, are each optimized for one kind of workload and cannot perform equally well under all workloads. Migrating from the write-optimized LSM-tree to a read-optimized B+-tree is costly and mandates some system downtime to reorganize data. We design adaptive indexes that can dynamically morph from a pure LSM-tree to a pure buffered B-tree and back, and that have interesting states in between. There are two challenges: allowing concurrent operations and avoiding system downtime. This demo benchmarks the proposed AHA-tree index under dynamic workloads and shows how the index evolves from one state to another without blocking.
