vLSM: Low tail latency and I/O amplification in LSM-based KV stores
Giorgos Xanthakis, Antonios Katsarakis, Giorgos Saloustros, Angelos Bilas
TL;DR
vLSM targets the persistent tail latency problem in LSM-based KV stores. It models tail latency through compaction chains and proposes a design that reduces both chain width and chain length without inflating I/O amplification or memory use. To do so, it eliminates tiering in L0, uses smaller SSTs, raises the growth factor between L1 and L2 to a larger-than-typical value (Phi), and introduces overlap-aware SSTs (vSSTs) in L1 to contain merge amplification. Experimental results show up to 4.8x lower P99 latency for writes and up to 12.5x for reads, up to 60% fewer cumulative write stalls, and slightly better I/O amplification at the same memory budget. This makes vLSM a practical option for latency-sensitive applications that cannot trade away memory or I/O efficiency.
Abstract
LSM-based key-value (KV) stores are an important component in modern data infrastructures. However, they suffer from high tail latency, on the order of several seconds, which makes them less attractive for user-facing applications. In this paper, we introduce the notion of compaction chains and analyse how they affect tail latency. We then show that modern designs reduce tail latency either by increasing I/O amplification or by requiring large amounts of memory. Based on our analysis, we present vLSM, a new KV store design that improves tail latency significantly without compromising on memory or I/O amplification. vLSM reduces (a) compaction chain width, by using small SSTs and eliminating the tiering compaction that modern systems require in L0, and (b) compaction chain length, by using a larger than typical growth factor between L1 and L2 and introducing overlap-aware SSTs in L1. We implement vLSM in RocksDB and evaluate it using db_bench and YCSB. Our evaluation highlights the underlying trade-off among memory requirements, I/O amplification, and tail latency, as well as the advantage of vLSM over current approaches. vLSM improves P99 tail latency by up to 4.8x for writes and by up to 12.5x for reads, and reduces cumulative write stalls by up to 60%, while also slightly improving I/O amplification at the same memory budget.
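To give a rough intuition for point (b), the sketch below counts how many levels a leveled LSM tree needs to hold a dataset, first with a uniform growth factor and then with a larger factor between L1 and L2 only. It is a toy model, not the paper's analysis: it assumes that the number of levels is an upper bound on compaction chain length, and the function name `levels_needed`, the 256 MiB L1 size, the 1 TiB dataset, and the factors 10 and 100 are all illustrative assumptions rather than values taken from vLSM or RocksDB.

```python
def levels_needed(dataset_bytes, l1_bytes, l1_l2_factor, factor):
    """Count levels L1..Ln until the last level can hold the dataset.

    Hypothetical toy model: each level's capacity is the previous level's
    capacity times a growth factor. `l1_l2_factor` applies only between
    L1 and L2; `factor` applies between all deeper levels.
    """
    levels = 1              # start with L1 alone
    capacity = l1_bytes
    growth = l1_l2_factor   # the first step is L1 -> L2
    while capacity < dataset_bytes:
        capacity *= growth
        growth = factor     # every later step uses the regular factor
        levels += 1
    return levels


L1_BYTES = 256 * 2**20      # assumed 256 MiB L1
DATASET_BYTES = 2**40       # assumed 1 TiB dataset

# Uniform growth factor of 10 at every level.
uniform = levels_needed(DATASET_BYTES, L1_BYTES, 10, 10)

# Larger growth factor (100, an arbitrary choice) between L1 and L2 only.
wide_l1_l2 = levels_needed(DATASET_BYTES, L1_BYTES, 100, 10)

print(uniform)     # 5
print(wide_l1_l2)  # 4
```

In this toy setting the larger L1-to-L2 factor removes one level, which shortens the longest possible cascade of compactions. A larger factor on its own would also raise the cost of merging L1 into L2, which is the problem the abstract says overlap-aware SSTs in L1 are meant to address.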
