[Extended Version] ArceKV: Towards Workload-driven LSM-compactions for Key-Value Store Under Dynamic Workloads
Junfeng Liu, Haoxuan Xie, Siqiang Luo
TL;DR
This work addresses the instability of LSM-tree–based KV stores under dynamic workloads by proposing ElasticLSM, which expands the action space beyond fixed level capacities, and Arce, a lightweight decision engine that scores compaction and write-stall actions. The resulting system, ArceKV, is built atop RocksDB and jointly optimizes compactions and stall thresholds via a windowed-cost model and a simulation-guided parameter search, reducing transition overhead while maintaining high read/write performance. On the theoretical side, the work shows that finding the optimal compaction sequence is NP-hard and introduces the concept of dominating compactions to prune the action space. These results are complemented by empirical validation across diverse workloads and against industrial baselines. The evaluation shows substantial gains in throughput and latency stability, with ArceKV outperforming state-of-the-art policies and comparable systems in dynamic scenarios, making it a practical approach for workload-driven KV stores.
Abstract
Key-value stores underpin a wide range of applications due to their simplicity and efficiency. Log-Structured Merge Trees (LSM-trees) dominate as their underlying structure, excelling at handling rapidly growing data. Recent research has focused on optimizing LSM-tree performance under static workloads with fixed read-write ratios. However, real-world workloads are highly dynamic, and existing workload-aware approaches often struggle to sustain optimal performance or incur substantial transition overhead when workload patterns shift. To address this, we propose ElasticLSM, which removes traditional LSM-tree structural constraints to allow more flexible management actions (i.e., compactions and write stalls), creating greater opportunities for continuous performance optimization. We further design Arce, a lightweight compaction decision engine that guides ElasticLSM in selecting the optimal action from its expanded action space. Building on these components, we implement ArceKV, a full-fledged key-value store atop RocksDB. Extensive evaluations demonstrate that ArceKV outperforms state-of-the-art compaction strategies across diverse workloads, delivering around 3x higher performance in dynamic scenarios.
