MementoHash: A Stateful, Minimal Memory, Best Performing Consistent Hash Algorithm

Massimo Coluzzi, Amos Brocco, Alessandro Antonucci, Tiziano Leidi

TL;DR

MementoHash extends JumpHash to handle random bucket failures without fixing the cluster capacity in advance. It stores only the history of removals in a compact replacement set and tracks the last removed bucket. It preserves Jump's fast, memory-light lookup while ensuring balance and minimal disruption through a dense b-array representation and a chain of replacements that resolves to a working bucket. Because no fixed capacity bound is required, it can scale indefinitely, and it uses significantly less memory than AnchorHash and DxHash, especially under typical failure rates. Benchmarks across three evaluation setups (stable, one-shot removals, and incremental removals) indicate optimal lookup performance and minimal memory usage in the best case, better results than AnchorHash and DxHash in the average case, and results at least as good as those two algorithms in the worst case.
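The mechanism summarized above can be illustrated with a small sketch. It is a simplified reconstruction from this summary, not the paper's pseudo-code: the class name `MementoSketch`, the tuple layout `(replacer, previously_removed)`, and the `rehash` mixer are all assumptions made for illustration, and restoring buckets is not covered. Only `jump_hash` follows a published algorithm (Lamping and Veach's jump consistent hash).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def jump_hash(key: int, num_buckets: int) -> int:
    """Jump consistent hash: maps a 64-bit key to [0, num_buckets)."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b

def rehash(key: int, bucket: int) -> int:
    """Hypothetical 64-bit mixer used to re-route a key away from `bucket`."""
    x = (key ^ (bucket * 0x9E3779B97F4A7C15)) & MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    return x ^ (x >> 31)

class MementoSketch:
    """Illustrative only: Jump lookup plus a history of removals."""

    def __init__(self, size: int):
        self.size = size          # range handed to jump_hash (the b-array size)
        # removed bucket -> (replacer, previously removed bucket), where
        # replacer = number of buckets still working after that removal.
        self.replacements = {}
        # Assumption: kept so removals could be undone in reverse order;
        # it is stored here but not used by lookup.
        self.last_removed = None

    def working(self) -> int:
        return self.size - len(self.replacements)

    def remove(self, bucket: int) -> None:
        if not 0 <= bucket < self.size or bucket in self.replacements:
            raise ValueError("bucket is not a working bucket")
        if self.working() <= 1:
            raise ValueError("cannot remove the last working bucket")
        if not self.replacements and bucket == self.size - 1:
            self.size -= 1        # tail removal with no history: plain Jump shrink
            return
        self.replacements[bucket] = (self.working() - 1, self.last_removed)
        self.last_removed = bucket

    def lookup(self, key: int) -> int:
        b = jump_hash(key, self.size)
        while b in self.replacements:
            replacer = self.replacements[b][0]
            # Re-route uniformly over the buckets that survived b's removal.
            d = rehash(key, b) % replacer
            # Follow replacements of buckets removed before b.
            while d in self.replacements and self.replacements[d][0] >= replacer:
                d = self.replacements[d][0]
            b = d
        return b

# Usage: remove an arbitrary bucket and check which keys moved.
ring = MementoSketch(10)
before = {k: ring.lookup(k) for k in range(1000)}
ring.remove(3)
moved = [k for k in before if ring.lookup(k) != before[k]]
print(all(before[k] == 3 for k in moved))  # only keys from bucket 3 are re-routed
```

With no removals the replacement map stays empty and `lookup` reduces to a single `jump_hash` call, which is why the memory footprint in this sketch grows only with the number of removed buckets rather than with a preallocated capacity.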

Abstract

Consistent hashing is used in distributed systems and networking applications to spread data evenly and efficiently across a cluster of nodes. In this paper, we present MementoHash, a novel consistent hashing algorithm that eliminates known limitations of state-of-the-art algorithms while keeping optimal performance and minimal memory usage. We describe the algorithm in detail, provide a pseudo-code implementation, and formally establish its theoretical guarantees. To measure the efficacy of MementoHash, we compare its memory usage and lookup time to those of three state-of-the-art algorithms: AnchorHash, DxHash, and JumpHash. Unlike JumpHash, MementoHash can handle random failures. Moreover, MementoHash does not require fixing the overall capacity of the cluster (as AnchorHash and DxHash do), allowing it to scale indefinitely. The number of removed nodes affects the performance of all the considered algorithms. Therefore, we conduct experiments in three different scenarios: stable (no removed nodes), one-shot removals (90% of the nodes removed at once), and incremental removals. We report averaged experimental results for cluster sizes varying from ten to one million nodes. Results indicate that our algorithm shows optimal lookup performance and minimal memory usage in its best-case scenario. It behaves better than AnchorHash and DxHash in its average-case scenario and at least as well as those two algorithms in its worst-case scenario. However, the worst-case scenario for MementoHash occurs when more than 70% of the nodes fail, which is an unlikely scenario. Therefore, MementoHash shows the best performance during the regular life cycle of a cluster.

Paper Structure

This paper contains 32 sections, 28 equations, 32 figures, 1 table, and 4 algorithms.

Figures (32)

  • Figure 1: Jump's representation of a cluster
  • Figure 2: Jump Hash: Adding two buckets
  • Figure 3: Jump Hash: Removing two buckets
  • Figure 4: Cluster representation for Anchor
  • Figure 5: Lookup process of Dx
  • ...and 27 more figures

Theorems & Definitions (7)

  • 7 proofs (untitled)