BLI: A High-performance Bucket-based Learned Index with Concurrency Support

Huibing Dong, Wenlong Wang, Chun Liu, David Du

TL;DR

The paper addresses throughput and concurrency bottlenecks in learned indexes for in-memory key-value stores caused by strict-order insertions. It proposes a Bucket-based Learned Index (BLI) with a 'globally sorted, locally unsorted' design using D-Buckets for data and S-Buckets for segment structure, augmented by hint-assisted operations to accelerate inserts and lookups. BLI supports lock-free concurrency via valid bits and Read-Copy-Update, and employs bottom-up bulk loading with adaptive SMOs to maintain performance. Empirical results show up to 2.21x throughput improvements over state-of-the-art learned indexes, with up to 3.91x gains under multi-threaded workloads.
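As an illustration of the hint-assisted operations and valid bits mentioned above, here is a minimal single-threaded sketch. It is not the authors' implementation: the class name `DBucket`, the fixed slot count, the probing order, and the integer-key assumption are all hypothetical. The only detail taken from the paper is that a MOD hash can serve as the hint function. The hint suggests which slot to try first, and the valid bit marks which slots hold live entries.

```python
class DBucket:
    """Fixed-capacity, unsorted bucket (hypothetical stand-in for a D-Bucket)."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.keys = [None] * capacity
        self.values = [None] * capacity
        self.valid = [False] * capacity  # one valid bit per slot

    def _hint(self, key):
        # MOD hash as the hint: suggests the first slot to try (integer keys assumed).
        return key % self.capacity

    def _probe(self, key):
        # Visit every slot exactly once, starting at the hinted slot.
        start = self._hint(key)
        for step in range(self.capacity):
            yield (start + step) % self.capacity

    def insert(self, key, value):
        free = None
        for slot in self._probe(key):
            if self.valid[slot]:
                if self.keys[slot] == key:
                    self.values[slot] = value  # key already present: update
                    return True
            elif free is None:
                free = slot  # remember the first free slot in probe order
        if free is None:
            return False  # bucket is full; a real index would split it here
        self.keys[free] = key
        self.values[free] = value
        self.valid[free] = True  # set the valid bit last, after the data is written
        return True

    def lookup(self, key):
        for slot in self._probe(key):
            if self.valid[slot] and self.keys[slot] == key:
                return self.values[slot]
        return None


bucket = DBucket(capacity=4)
for k in (10, 3, 7, 14):
    bucket.insert(k, str(k))
print(bucket.keys)  # slot order follows the hints, not the key order
```

When the hinted slot is free, an insert writes there directly and a later lookup finds the key on its first probe, with no sorting or shifting inside the bucket. Setting the valid bit last mirrors the idea that a reader should only see a slot once it is fully written. This sketch does not model the atomic operations a real lock-free design would need.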

Abstract

Learned indexes are a promising replacement for traditional tree-based indexes. They typically employ machine learning models to efficiently predict target positions in strictly sorted linear arrays. However, the strict sorted order 1) significantly increases insertion overhead, 2) makes it challenging to support lock-free concurrency, and 3) harms in-node lookup/insertion efficiency due to model inaccuracy. In this paper, we introduce a Bucket-based Learned Index (BLI), which is an updatable in-memory learned index that adopts a "globally sorted, locally unsorted" approach by replacing linear sorted arrays with Buckets. BLI optimizes the insertion throughput by only sorting Buckets, not the key-value pairs within a Bucket. BLI strategically balances four critical performance metrics: tree fanouts, lookup/insert latency for inner nodes, lookup/insert latency for leaf nodes, and memory consumption. To minimize maintenance costs, BLI adaptively performs lightweight bulk loading, insertion, node scaling, node splitting, model retraining, and node merging. BLI supports lock-free concurrency thanks to the unsorted design with Buckets. Our results show that BLI achieves up to 2.21x better throughput than state-of-the-art learned indexes, with up to 3.91x gains under multi-threaded conditions.
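The "globally sorted, locally unsorted" idea can be sketched in a few lines. This is a toy under stated assumptions, not BLI itself: the names `BucketIndex`, `pivots`, and `_locate` are hypothetical, the structure is a single flat level, and a binary search over bucket boundaries stands in for the learned model that would predict the target bucket. Buckets are kept in key order relative to one another, while entries inside a bucket are appended in arrival order.

```python
import bisect


class BucketIndex:
    """Toy index: buckets are ordered by pivot, entries inside a bucket are not."""

    def __init__(self, bucket_capacity=4):
        self.capacity = bucket_capacity
        self.pivots = [float("-inf")]  # pivots[i] = lower bound of bucket i
        self.buckets = [[]]            # each bucket: unsorted list of (key, value)

    def _locate(self, key):
        # Binary search stands in for a learned model's bucket prediction.
        return bisect.bisect_right(self.pivots, key) - 1

    def insert(self, key, value):
        i = self._locate(key)
        bucket = self.buckets[i]
        for j, (k, _) in enumerate(bucket):
            if k == key:
                bucket[j] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # no shifting, no in-bucket sorting
        if len(bucket) > self.capacity:
            self._split(i)

    def _split(self, i):
        # Sorting happens only here, when a bucket overflows.
        items = sorted(self.buckets[i], key=lambda kv: kv[0])
        mid = len(items) // 2
        self.buckets[i] = items[:mid]
        self.buckets.insert(i + 1, items[mid:])
        self.pivots.insert(i + 1, items[mid][0])

    def lookup(self, key):
        for k, v in self.buckets[self._locate(key)]:
            if k == key:
                return v
        return None


index = BucketIndex(bucket_capacity=4)
for k in (50, 10, 40, 20, 30, 25, 15):
    index.insert(k, k * 2)
print(index.pivots)   # bucket boundaries stay sorted
print(index.buckets)  # entries within a bucket may be out of order
```

An insert touches only one bucket and appends to it, so no other entries move. The cost shifts to lookups, which scan a small bucket instead of binary-searching a sorted array, and to the occasional split.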

Paper Structure

This paper contains 46 sections, 1 equation, 9 figures, 2 tables, 3 algorithms.

Figures (9)

  • Figure 1: The average prediction error with different group sizes on the first 10000 keys of books, fb, and osm.
  • Figure 2: An example of the BLI architecture, illustrating a three-level Segment and the bottom-level D-Bucket structure, employing the MOD hash as a hint function.
  • Figure 3: Structure of an S-Bucket.
  • Figure 4: Throughput under different hint choices with the read-only workload.
  • Figure 5: The RCU process for D-Bucket splits in BLI.
  • ...and 4 more figures