
FlipHash: A Constant-Time Consistent Range-Hashing Algorithm

Charles Masson, Homin K. Lee

TL;DR

FlipHash tackles the problem of consistent range-hashing for sequentially indexed resources, seeking minimal data reshuffling when resources are added and ensuring fast key hashing. The approach constructs a ranged hash from a hash family with independence properties and a flip operation to guarantee monotonicity and near-uniform distribution. The paper proves monotonicity and regularity, and shows constant-time average hashing with constant memory, plus a generalization to any n up to 2^q. Empirical evaluations and implementations demonstrate favorable performance over JumpHash and competitive results against broader consistent hashing schemes, with practical applicability to database sharding and distributed storage.

Abstract

Consistent range-hashing is a technique used in distributed systems, either directly or as a subroutine for consistent hashing, commonly to realize an even and stable data distribution over a variable number of resources. We introduce FlipHash, a consistent range-hashing algorithm with constant time complexity and low memory requirements. Like Jump Consistent Hash, FlipHash is intended for applications where resources can be indexed sequentially. Under this condition, it ensures that keys are hashed evenly across resources and that changing the number of resources only causes keys to be remapped from a removed resource or to an added one, but never shuffled across persisted ones. FlipHash differentiates itself with its low computational cost, achieving constant-time complexity. We show that FlipHash beats Jump Consistent Hash's cost, which is logarithmic in the number of resources, both theoretically and in experiments over practical settings.
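The stability property described above (keys move only to an added resource, never between persisted ones) can be checked empirically. FlipHash's own construction is not given in this summary, so the following minimal sketch uses Jump Consistent Hash, the baseline named in the abstract, as published by Lamping and Veach. The helper `moved_keys` and the choice of integer keys are assumptions made for illustration only.

```python
MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit wraparound


def jump_hash(key: int, num_buckets: int) -> int:
    """Jump Consistent Hash for a 64-bit integer key.

    Returns a bucket index in [0, num_buckets). This is the baseline
    algorithm, not FlipHash.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step that drives the pseudo-random jumps
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def moved_keys(keys, n):
    """Hypothetical helper: list (key, old, new) for every key whose
    bucket changes when the number of buckets grows from n to n + 1."""
    moved = []
    for k in keys:
        old, new = jump_hash(k, n), jump_hash(k, n + 1)
        if old != new:
            moved.append((k, old, new))
    return moved


if __name__ == "__main__":
    for k, old, new in moved_keys(range(20), 4):
        # Every remapped key lands on the newly added bucket (index 4).
        print(f"key={k}: bucket {old} -> {new}")
```

Any consistent range-hashing function, FlipHash included according to the abstract, should pass the same check: every key that moves when growing from `n` to `n + 1` resources lands on index `n`.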

Paper Structure

This paper contains 12 sections, 6 theorems, 5 equations, 3 figures, 4 tables, 1 algorithm.

Key Result

theorem 1

$\tilde{f}$ is monotone.
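
For context, monotonicity of a ranged hash function is commonly stated along the following lines. The symbols $h$, $x$, $n$ and $n'$ are illustrative assumptions, and the paper's Definition 1 may be phrased differently. Here $h(x, n) \in \{0, \dots, n-1\}$ denotes the resource assigned to key $x$ when $n$ resources are available.

$$
\forall x,\ \forall n \le n' : \quad h(x, n') \in \{h(x, n)\} \cup \{n, \dots, n'-1\}
$$

In words: when the number of resources grows from $n$ to $n'$, a key either keeps its resource or moves to one of the newly added ones.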

Figures (3)

  • Figure 1: Average (line) and interdecile (filled area) evaluation wall times of FlipHash and JumpHash.
  • Figure 2: Average (line) and interdecile (filled area) evaluation wall times of FlipHash and JumpHash for small numbers of resources.
  • Figure 3: Measure of the uniformity of the key distribution across 1000 resources of FlipHash and JumpHash, with randomly generated keys.

Theorems & Definitions (9)

  • definition 1: monotonicity
  • definition 2: $\epsilon$-regularity
  • definition 3: $(\epsilon, k)$-wise independence
  • theorem 1
  • lemma 1
  • corollary 1
  • theorem 2
  • theorem 3
  • theorem 4