Relaxation for Efficient Asynchronous Queues
Samuel Baldwin, Cole Hausman, Mohamed Bakr, Edward Talmage
TL;DR
This work addresses efficient shared data structures in fully asynchronous message-passing systems. It introduces a fully replicated asynchronous FIFO queue that uses vector clocks and Confirmation Lists to achieve a worst-case per-operation cost of $2d$, the single round-trip message delay described in the abstract. It then extends the approach to a relaxed $k$-Out-of-Order queue, which allocates ownership of the $k$ oldest elements so that Dequeue can complete with mostly local work, reducing amortized cost at the expense of strict ordering. Key contributions are the first published fully distributed replicated FIFO queue in an asynchronous setting and a novel asynchronous relaxed queue with provable correctness and a tunable trade-off between performance and ordering, together with a formal framework for linearizability and ownership-based fast paths. The results show that relaxation can circumvent, in the amortized sense, the lower bounds that apply to strict FIFO queues in asynchronous distributed systems, giving significantly faster common-case access while maintaining correctness, and they lay groundwork for future fault-tolerant extensions. Together, these results give a concrete path to scalable, high-performance asynchronous shared data structures with adjustable guarantees.
Abstract
We explore the problem of efficiently implementing shared data structures in an asynchronous computing environment. We start with a traditional FIFO queue, showing that full replication is possible with a delay of only a single round-trip message between the invocation and response of each operation. This is an optimal, or near-optimal, running time for the Dequeue operation. We then consider ways to circumvent this limitation on performance. Though we cannot improve the worst-case time per operation instance, we show that relaxation, that is, weakening the ordering guarantees of the Queue data type, allows most Dequeue instances to return after only local computation, giving a low amortized cost per instance. This performance is tunable, giving a customizable tradeoff between the ordering of data and the speed of access.
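To make the relaxation concrete, the following minimal sketch models the sequential behavior of a $k$-Out-of-Order queue, assuming the common reading in which a Dequeue may return any of the $k$ oldest elements. It is not the paper's distributed algorithm: the class name, the `choice` parameter, and `legal_dequeue_results` are hypothetical names introduced only for illustration, and the paper's formal definition may differ in its details.

```python
from collections import deque


class KOutOfOrderQueue:
    """Sequential sketch of a relaxed queue (illustrative assumption only).

    A dequeue may return any of the k oldest elements; k = 1 is a strict
    FIFO queue.
    """

    def __init__(self, k):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self._items = deque()  # oldest element at index 0

    def enqueue(self, x):
        self._items.append(x)

    def legal_dequeue_results(self):
        # The (at most) k oldest elements: any of them is a legal return value.
        return list(self._items)[: self.k]

    def dequeue(self, choice=0):
        """Remove and return the element at position `choice` among the
        k oldest. Returns None on an empty queue."""
        if not self._items:
            return None
        if not 0 <= choice < min(self.k, len(self._items)):
            raise IndexError("choice must select one of the k oldest elements")
        x = self._items[choice]
        del self._items[choice]
        return x


q = KOutOfOrderQueue(k=2)
for item in ["a", "b", "c"]:
    q.enqueue(item)
first = q.dequeue(choice=1)   # legal: "b" is among the 2 oldest
second = q.dequeue(choice=0)  # the oldest remaining element
```

In this model, larger $k$ gives each Dequeue more legal answers, which is the slack a distributed implementation can use to answer locally; smaller $k$ restores stricter ordering.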
