A Programming Model for Disaggregated Memory over CXL
Gal Assa, Lucas Bürgi, Michal Friedman, Ori Lahav
TL;DR
CXL0 addresses the lack of a formal programming model for cache-coherent disaggregated memory over CXL. It introduces an abstract, executable system and failure model together with an operational semantics. It presents three general transformations that make linearizable algorithms durable under partial failures, and an additional transformation that applies to any durably linearizable object, that is, one designed for persistent main memory and full-system crashes. All transformations come with formal correctness guarantees. The authors provide initial hardware measurements mapping CXL0 primitives and report latency and throughput insights, illustrating practical implications for software and hardware designers. Overall, the work establishes a foundation for reasoning about and constructing correct, durable algorithms on CXL-based disaggregated memory, and for guiding future standardization and modeling efforts.
Abstract
CXL (Compute Express Link) is an emerging open industry-standard interconnect between processing and memory devices that is expected to revolutionize the way systems are designed in the near future. It enables cache-coherent shared memory pools in a disaggregated fashion at unprecedented scales, allowing algorithms to interact with a variety of storage devices using simple loads and stores. Alongside unleashing unique opportunities for a wide range of applications, CXL introduces new challenges of data management and crash consistency. Alas, CXL lacks an adequate programming model, which makes reasoning about the correctness and expected behaviors of algorithms and systems on top of it nearly impossible. In this work, we present CXL0, the first programming model for concurrent programs running on top of CXL. We propose a high-level abstraction for CXL memory accesses and formally define operational semantics on top of that abstraction. We perform initial measurements that provide practical insight into CXL0. We provide a set of general transformations that adapt concurrent algorithms to the new disruptive technology. These transformations enhance linearizable algorithms with durability under a general partial-failure model. We provide an additional transformation for algorithms designed for persistent main memory and full-system crashes. We believe that this work will serve as a stepping stone for systems design and modeling on top of CXL, and support the development of future models as software and hardware evolve.
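To make the durability concern concrete, the following is a minimal toy sketch of why a plain store may not survive a partial failure, and how a flush-after-write wrapper changes that. It does not use the CXL0 primitives or reproduce the paper's transformations: the names `ToyNode`, `store`, `flush`, `crash`, and `durable_store` are hypothetical, and the model assumes a single node with a volatile cache in front of a memory pool that outlives the node's crash. Cache coherence between nodes is deliberately not modeled.

```python
class ToyNode:
    """Hypothetical compute node: a volatile cache in front of a shared pool."""

    def __init__(self, shared):
        self.shared = shared  # dict standing in for memory that survives this node's crash
        self.cache = {}       # volatile state, lost when this node crashes

    def store(self, addr, value):
        # A plain store only reaches the local cache in this toy model.
        self.cache[addr] = value

    def flush(self, addr):
        # Write the cached value (if any) back to the shared pool.
        if addr in self.cache:
            self.shared[addr] = self.cache[addr]

    def load(self, addr):
        # Prefer the cached copy, fall back to the shared pool, else None.
        return self.cache.get(addr, self.shared.get(addr))

    def crash(self):
        # Partial failure: only this node's volatile cache is lost.
        self.cache.clear()


def durable_store(node, addr, value):
    """Assumed flush-after-write wrapper: store, then force the value out."""
    node.store(addr, value)
    node.flush(addr)


def demo():
    node = ToyNode(shared={})
    node.store("x", 1)           # never flushed
    node.crash()
    lost = node.load("x")        # the unflushed write is gone

    durable_store(node, "y", 2)  # stored and flushed
    node.crash()
    kept = node.load("y")        # the flushed write survives
    return lost, kept
```

Calling `demo()` shows the contrast: the unflushed write to `x` disappears after the crash, while the flushed write to `y` is still readable from the shared pool. A real transformation must also reason about concurrent accesses from other nodes, which this sketch ignores.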
