Next-Gen Computing Systems with Compute Express Link: a Comprehensive Survey
Chen Chen, Xinkui Zhao, Guanjie Cheng, Yuesheng Xu, Shuiguang Deng, Jianwei Yin
TL;DR
The paper addresses the growing interconnect bottleneck in modern computing systems by surveying architectures based on Compute Express Link (CXL), from single-machine memory expansion to distributed shared memory (DSM). It organizes single-machine research into Memory Expansion and Unified Memory, and extends the discussion to disaggregated systems and DSM enabled by CXL 3.0, grounded in real hardware measurements and diverse simulation platforms. The work catalogs concrete approaches such as tiered memory, near-memory processing, CPU-relay and direct CXL access for unified memory, memory pooling, and shared-memory RPC frameworks. It also outlines future research directions, including memory interleaving, workload-agnostic offloading, GPU memory extension, virtualization, and cross-rack interconnect. Its significance lies in providing a structured, up-to-date map of CXL-enabled memory-centric computing, guiding both academia and industry toward scalable, coherent, and flexible data-center infrastructures.
Abstract
Interconnection is crucial for computing systems. However, the performance of current interconnects between processors and devices, such as memory devices and accelerators, lags significantly behind their computing performance, severely limiting overall system performance. To address this challenge, Intel proposed Compute Express Link (CXL), an open industry-standard interconnect. With memory semantics, CXL offers low-latency, scalable, and coherent interconnection between processors and devices. This paper introduces recent advances in CXL-based computing systems, from single-machine to distributed. In single-machine systems, we classify existing research into two categories: Memory Expansion and Unified Memory. Memory Expansion focuses on processors and memory and aims to address the memory wall challenge. Unified Memory focuses on processors and accelerators and aims to enhance collaboration in heterogeneous computing systems. In distributed systems, we present how to build efficient disaggregated systems on CXL infrastructure, enabling resource pooling and sharing. Finally, we discuss future research directions and envision memory-centric computing with CXL.
