Autumn: A Scalable Read Optimized LSM-tree based Key-Value Stores with Fast Point and Range Read Speed
Fuheng Zhao, Zach Miller, Leron Reznikov, Divyakant Agrawal, Amr El Abbadi
TL;DR
Autumn introduces Garnering, a read-optimized merge policy for LSM-tree stores that dynamically adjusts the capacity ratios between adjacent levels to improve point and range reads. The key idea is to fix the last-level ratio while scaling the ratios between lower levels by a factor $c<1$, which yields a level count of $L=O(\sqrt{-\log_{c}(\frac{N}{B\cdot T})})$ and worst-case read costs of the same order, i.e., $O(\sqrt{\log N})$, with or without Bloom filters. Autumn also employs a delayed last-level compaction strategy and DRAM pinning for Level 0 to curb write amplification, and it leverages Bloom-filter optimization to minimize point-read I/Os. Empirical results on LevelDB/RocksDB show substantial read-speed gains with modest or comparable write amplification, supporting Autumn's applicability to OLTP and HTAP workloads. The work provides a principled framework for balancing read and write costs in LSM-trees, supported by theoretical analysis and benchmarking.
Abstract
Log-Structured Merge-tree (LSM-tree) based key-value stores are widely used in many storage systems to support a variety of operations such as updates, point reads, and range reads. Traditionally, an LSM-tree's merge policy organizes data into multiple levels of exponentially increasing capacity to support high-speed writes. However, we contend that traditional merge policies are not optimized for reads. In this work, we present Autumn, a scalable and read-optimized LSM-tree based key-value store with minimal point and range read cost. The key idea in improving read performance is to dynamically adjust the capacity ratio between two adjacent levels as more data are stored. As a result, smaller levels gradually increase their capacities and merge more often. In particular, by applying the novel Garnering merge policy, Autumn improves the point and range read cost from the previous best known $O(\log N)$ complexity to $O(\sqrt{\log N})$. While the Garnering merge policy optimizes for both point reads and range reads, it maintains high performance for updates. Moreover, to further improve update costs, Autumn uses a small, bounded amount of DRAM to pin the first level of the LSM-tree. We implemented Autumn on top of LevelDB and experimentally showcase the performance gains on real-world workloads.
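
To give a feel for why adjusting the capacity ratios can shrink the level count from $O(\log N)$ to $O(\sqrt{\log N})$, the following Python sketch compares a classical leveled layout with a simplified model of ratio scaling. The model is an assumption made for illustration, not the paper's exact definition of Garnering: the ratio between the last two levels is $T$, and each ratio one step further from the last level is larger by a factor $1/c$ (with $c<1$). Under that assumption the largest level of an $L$-level tree holds $B\cdot T^{L}/c^{(L-1)(L-2)/2}$ entries, so the exponent of $1/c$ grows quadratically in $L$ and the number of levels needed grows like the square root of $-\log_{c}N$. The function names and the reading of $N$, $B$, and $T$ as total entries, buffer entries, and base size ratio are likewise illustrative assumptions.

```python
def levels_leveling(n_entries, buffer_entries, t):
    """Levels needed when every adjacent-level capacity ratio equals t."""
    levels = 1
    capacity = buffer_entries * t  # capacity of the first level
    while capacity < n_entries:
        capacity *= t
        levels += 1
    return levels


def largest_capacity_scaled(levels, buffer_entries, t, c):
    """Capacity of the largest level under the ASSUMED ratio-scaling model.

    The ratio next to the last level is t; each ratio one step further
    from the last level is larger by a factor 1/c (hypothetical model).
    """
    capacity = buffer_entries * t  # capacity of the first level
    for k in range(levels - 1):
        capacity *= t / c**k
    return capacity


def levels_scaled(n_entries, buffer_entries, t, c):
    """Smallest level count whose largest level can hold n_entries."""
    levels = 1
    while largest_capacity_scaled(levels, buffer_entries, t, c) < n_entries:
        levels += 1
    return levels


if __name__ == "__main__":
    n, b, t = 10**9, 1000, 2
    print("leveling:", levels_leveling(n, b, t))
    for c in (1.0, 0.9, 0.8, 0.5):
        print(f"scaled ratios, c={c}:", levels_scaled(n, b, t, c))
```

With $c=1$ the model degenerates to ordinary leveling, and smaller values of $c$ need fewer levels for the same data size. The sketch only counts levels; it does not model the extra merge work that larger ratios imply, which is the write-cost side of the trade-off the abstract mentions.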
