Formal Specification for Fast ACS: Low-Latency File-Based Ordered Message Delivery at Scale
Sushant Kumar Gupta, Anil Raghunath Iyer, Chang Yu, Neel Bagora, Olivier Pomerleau, Vivek Kumar, Prunthaban Kanthakumar
TL;DR
Fast ACS is a file-based, multi-layer storage system for low-latency, ordered message delivery at internet scale. It combines intra-cluster RMA reads with inter-cluster RPC transfers, uses a copy tree built with Prim's minimum spanning tree (MST) algorithm for efficient cross-cluster replication, and employs a two-tier cache (data and metadata) atop Colossus to achieve high throughput with low tail latency. The design is validated through large-scale experiments showing sub-second p99 delays and Tbps-scale bandwidth, along with extensive production experience in the Google Ads context. A complementary formal specification in TLA+ demonstrates safety and eventual progress for the dueling-writers cache model, underpinning the system's reliability guarantees.
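To make the copy-tree idea concrete, the following is a minimal sketch of Prim's algorithm choosing, for each cluster, the cluster it copies data from. The cluster names, the edge costs, and the function name `prim_copy_tree` are hypothetical, chosen only for illustration; the cost metric and tree construction actually used in Fast ACS are not described here and may differ.

```python
import heapq

def prim_copy_tree(cost, source):
    """Build a minimum spanning tree rooted at `source` with Prim's algorithm.

    `cost[a][b]` is an assumed symmetric transfer cost between clusters a and b.
    Returns a dict mapping each cluster to the cluster it copies from.
    """
    parent = {source: None}
    # Heap entries are (edge cost, destination cluster, upstream cluster).
    frontier = [(c, dst, source) for dst, c in cost[source].items()]
    heapq.heapify(frontier)
    while frontier and len(parent) < len(cost):
        c, dst, src = heapq.heappop(frontier)
        if dst in parent:
            continue  # already attached to the tree via a cheaper edge
        parent[dst] = src
        for nxt, c_next in cost[dst].items():
            if nxt not in parent:
                heapq.heappush(frontier, (c_next, nxt, dst))
    return parent

# Hypothetical inter-cluster costs (for example, round-trip times in ms).
cost = {
    "us-east": {"us-west": 60, "europe": 80, "asia": 180},
    "us-west": {"us-east": 60, "europe": 140, "asia": 100},
    "europe": {"us-east": 80, "us-west": 140, "asia": 160},
    "asia": {"us-east": 180, "us-west": 100, "europe": 160},
}
tree = prim_copy_tree(cost, "us-east")
print(tree)
```

In this toy topology, the source cluster sends directly to `us-west` and `europe`, while `asia` copies from `us-west` rather than over the more expensive direct edge from the source.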
Abstract
Low-latency message delivery is crucial for real-time systems. Data originating from a producer must be delivered to consumers, potentially distributed in clusters across metropolitan and continental boundaries. With the growing scale of computing, there can be several thousand consumers of the data. Such systems require a robust messaging system capable of transmitting messages across clusters and delivering them efficiently to consumers. The system must offer guarantees such as ordering and at-least-once delivery while avoiding overload on consumers, allowing them to consume messages at their own pace. This paper presents the design of Fast ACS (an abbreviation for Ads Copy Service), a file-based ordered message delivery system that delivers messages using a combination of two-sided (inter-cluster) and one-sided (intra-cluster) communication primitives, namely Remote Procedure Call and Remote Memory Access, respectively. The system has been successfully deployed to dozens of production clusters and scales to accommodate several thousand consumers within each cluster, which amounts to Tbps-scale intra-cluster consumer traffic at peak. Notably, Fast ACS delivers messages to consumers across the globe within a few seconds, or even under a second, at p99, depending on message volume and consumer scale, at a low resource cost.
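The delivery guarantees named above (ordering, at-least-once delivery, and consumer-paced reads) can be illustrated with a small pull-based sketch. This is not the Fast ACS implementation: the `OrderedLog` and `Consumer` classes, the offset-commit scheme, and the simulated crash flag are all assumptions made for illustration. The consumer reads from its last committed offset and commits only after processing, so a failure before the commit causes redelivery rather than loss.

```python
class OrderedLog:
    """Append-only in-memory log standing in for a file of ordered messages."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)

    def read(self, offset, max_messages):
        # Return up to max_messages starting at offset, in append order.
        return self._messages[offset:offset + max_messages]


class Consumer:
    """Pulls batches at its own pace and commits its offset after processing."""

    def __init__(self, log):
        self.log = log
        self.committed = 0   # offset of the next unprocessed message
        self.delivered = []  # every message handed to the application

    def poll(self, max_messages, crash_before_commit=False):
        batch = self.log.read(self.committed, max_messages)
        self.delivered.extend(batch)  # "process" the batch
        if not crash_before_commit:
            self.committed += len(batch)  # commit only after processing
        return batch


log = OrderedLog()
for i in range(5):
    log.append(f"m{i}")

consumer = Consumer(log)
consumer.poll(2)                            # m0, m1 processed and committed
consumer.poll(2, crash_before_commit=True)  # m2, m3 processed, commit lost
consumer.poll(2)                            # m2, m3 redelivered, then committed
consumer.poll(2)                            # m4
print(consumer.delivered)
```

Every message is delivered at least once, duplicates appear only after the simulated failure, and each batch preserves the log order. Because the consumer chooses when to poll and how many messages to request, the producer never pushes more than the consumer can absorb.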
