Memory Sharing with CXL: Hardware and Software Design Approaches
Sunita Jain, Nagaradhesh Yeleswarapu, Hasan Al Maruf, Rita Gupta
TL;DR
The paper addresses the challenge of memory sharing in CXL-enabled systems across CXL generations, highlighting the limitations of traditional tightly coupled CPU-memory architectures and the potential of CXL to enable memory pooling and sharing. It surveys software-only approaches (a dual-headed topology, a custom framework, and OpenSHMEM-based PGAS) and hardware-assisted methods (dual-headed CXL Type-3 devices with hardware atomics and Back-Invalidate (BI) coherence), and discusses a hybrid snoop-filter strategy to balance precision and performance. It also analyzes trade-offs in sharing granularity and security, proposing hardware-assisted isolation and selective, region-based coherence to manage overhead and security risks. The work argues that combining software and hardware design is essential to unlock the rack-scale memory sharing and near-data processing capabilities enabled by CXL 3.0, paving the way for Global Integrated Memory and memory-disaggregated architectures.
Abstract
Compute Express Link (CXL) is a rapidly emerging coherent interconnect standard that provides opportunities for memory pooling and sharing. Memory sharing is a well-established software feature that improves memory utilization by avoiding unnecessary data movement. In this paper, we discuss multiple approaches to enabling memory sharing with different generations of the CXL protocol (i.e., CXL 2.0 and CXL 3.0), considering the challenges each architecture poses from the device hardware and software viewpoints.
