The Dawn of Disaggregation and the Coherence Conundrum: A Call for Federated Coherence
Jaewan Hong, Marcos K. Aguilera, Emmanuel Amaro, Vincent Liu, Aurojit Panda, Ion Stoica
TL;DR
This work argues that global cache coherence is impractical for disaggregated memory in large data centers because of scalability limitations and hardware complexity. It formally defines federated coherence, a model that provides coherence only within each node: it eliminates inter-node coherence traffic, reuses existing intra-node coherence, and can be programmed with a few simple paradigms. The authors describe node ownership, immutability, versioning, and a coordination library as the core primitives, and illustrate applications across microservices, publish-subscribe, and immutable object stores. A call to action urges the hardware and software communities to converge on a feasible model, develop end-to-end benchmarks, and explore language support and tooling to harness federated coherence in real systems.
Abstract
Disaggregated memory is an upcoming data center technology that will allow nodes (servers) to share data efficiently. Sharing data raises the question of what level of cache coherence the system should provide. While current proposals aim to provide coherence for all or parts of the disaggregated memory, we argue that this approach is problematic because of scalability limitations and hardware complexity. Instead, we propose and formally define federated coherence, a model that provides coherence only within nodes, not across nodes. Federated coherence can use the intra-node coherence that current processors already provide, without requiring expensive mechanisms for inter-node coherence. Developers can use federated coherence with a few simple programming paradigms and a synchronization library. We sketch some potential applications.
