FlipHash: A Constant-Time Consistent Range-Hashing Algorithm
Charles Masson, Homin K. Lee
TL;DR
FlipHash addresses consistent range-hashing for sequentially indexed resources: it aims to minimize data reshuffling when the number of resources changes while keeping key hashing fast. The approach constructs a ranged hash from a hash family with independence properties, together with a flip operation that guarantees monotonicity and a near-uniform distribution. The paper proves monotonicity and regularity, shows that hashing takes constant time on average with constant memory, and generalizes the construction to any number of resources n up to 2^q. Empirical evaluations and implementations show that FlipHash outperforms Jump Consistent Hash (JumpHash) and is competitive with broader consistent hashing schemes, with practical applicability to database sharding and distributed storage.
Abstract
Consistent range-hashing is a technique used in distributed systems, either directly or as a subroutine for consistent hashing, commonly to realize an even and stable data distribution over a variable number of resources. We introduce FlipHash, a consistent range-hashing algorithm with constant time complexity and low memory requirements. Like Jump Consistent Hash, FlipHash is intended for applications where resources can be indexed sequentially. Under this condition, it ensures that keys are hashed evenly across resources and that changing the number of resources only causes keys to be remapped from a removed resource or to an added one, but never shuffled across the resources that remain. FlipHash differentiates itself through its low computational cost, achieving constant-time complexity. We show, both theoretically and in experiments over practical settings, that FlipHash beats the cost of Jump Consistent Hash, which is logarithmic in the number of resources.
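To make the consistency property concrete, the sketch below implements the baseline the abstract compares against, Jump Consistent Hash, following the published algorithm of Lamping and Veach. It is not FlipHash, whose construction is not given in this section. The helper name `moved_keys` and the integer keys used here are illustrative assumptions, not part of either paper.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # emulate 64-bit unsigned wraparound


def jump_hash(key: int, num_resources: int) -> int:
    """Jump Consistent Hash (baseline, not FlipHash).

    Maps a 64-bit integer key to a resource index in [0, num_resources).
    The loop runs O(log num_resources) times in expectation, which is the
    logarithmic cost mentioned in the abstract.
    """
    if num_resources < 1:
        raise ValueError("num_resources must be at least 1")
    key &= MASK64
    b, j = -1, 0
    while j < num_resources:
        b = j
        # Linear congruential step used as the pseudorandom generator.
        key = (key * 2862933555777941757 + 1) & MASK64
        # Jump forward to the next candidate index.
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def moved_keys(keys, old_n: int, new_n: int):
    """Hypothetical helper: list (key, old_index, new_index) for every key
    whose resource changes when the resource count goes from old_n to new_n."""
    moves = []
    for k in keys:
        before, after = jump_hash(k, old_n), jump_hash(k, new_n)
        if before != after:
            moves.append((k, before, after))
    return moves


if __name__ == "__main__":
    keys = range(10_000)
    moves = moved_keys(keys, 10, 11)
    # Every remapped key lands on the newly added resource (index 10).
    print(len(moves), all(new == 10 for _, _, new in moves))
```

When one resource is added, a key either stays where it was or moves to the new resource, and no key moves between the resources that were already present. A consistent range-hashing algorithm such as FlipHash is expected to satisfy the same property; the difference claimed in the abstract is the cost of evaluating the hash.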
