The Case for Persistent CXL switches
Khan Shaikhul Hadi, Naveed Ul Mustafa, Mark Heinrich, Yan Solihin
TL;DR
The paper addresses the high latency of persisting updates to CXL-attached persistent memory by introducing a memory-centric Persistent CXL Switch (PCS) that embeds a Persistent Buffer (PB) in the switch. The authors design a system-independent PB with a corresponding Persist Buffer Controller (PBC) and Selector (PBCS) that preserve correctness (read/write ordering and crash consistency) while overlapping persistence with normal switch operation. A Read Forwarding (RF) optimization further reduces latency by serving reads from the PB when possible. On Splash-4 workloads, the design achieves average speedups of 12% (PB) and 15% (PB_RF) over a volatile CXL switch, with substantial reductions in persist latency and read-latency improvements for some workloads, demonstrating practical benefits for crash-consistent, data-intensive applications.
Abstract
A Compute Express Link (CXL) switch allows memory extension via the PCIe physical layer to address the increasing demand for larger memory capacities in data centers. However, CXL-attached memory introduces 170 ns to 400 ns of memory latency. This becomes a significant performance bottleneck for applications that host data in persistent memory, as all updates, after traversing the CXL switch, must reach the persistent domain to ensure crash-consistent updates. We make a case for a persistent CXL switch that persists updates as soon as they reach the switch and hence significantly reduces the latency of persisting data. To enable this, we present a system-independent persistent buffer (PB) design that ensures data persistency at the CXL switch. Our PB design provides a 12% speedup, on average, over a volatile CXL switch. Our read forwarding optimization improves the speedup to 15%.
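To make the idea of persisting at the switch and forwarding reads more concrete, the following is a minimal toy sketch in Python. It is not the authors' PB, PBC, or PBCS design: the class name `ToyPersistentBuffer`, the fixed `capacity`, the oldest-first drain policy, and the dictionary standing in for CXL-attached memory are all assumptions made for illustration. It only shows the two behaviors the abstract describes: a write is treated as persisted once it is held in the buffer, and a read is served from the buffer when the address is still there.

```python
from collections import OrderedDict


class ToyPersistentBuffer:
    """Toy model of a persistent buffer (PB) placed in front of slower memory.

    Hypothetical illustration only; not the design evaluated in the paper.
    """

    def __init__(self, capacity=4):
        self.capacity = capacity      # assumed number of PB entries
        self.entries = OrderedDict()  # addr -> newest value, oldest entry first
        self.memory = {}              # stand-in for CXL-attached persistent memory

    def write(self, addr, value):
        """Accept a write; in this model it counts as persisted once buffered."""
        if addr not in self.entries and len(self.entries) >= self.capacity:
            self.drain_one()          # make room by writing back the oldest entry
        self.entries[addr] = value    # an existing entry keeps its queue position

    def drain_one(self):
        """Write the oldest buffered entry back to memory, if there is one."""
        if self.entries:
            addr, value = self.entries.popitem(last=False)
            self.memory[addr] = value

    def read(self, addr):
        """Return (value, source); buffered addresses are forwarded from the PB."""
        if addr in self.entries:
            return self.entries[addr], "pb"
        return self.memory.get(addr), "memory"


if __name__ == "__main__":
    pb = ToyPersistentBuffer(capacity=2)
    pb.write(0x10, "a")
    pb.write(0x20, "b")
    print(pb.read(0x10))   # served from the buffer (read forwarding)
    pb.write(0x30, "c")    # buffer is full, so the oldest entry drains first
    print(pb.read(0x10))   # now served from the backing memory
```

In this sketch the buffer is always checked before memory, so a read never observes a value older than the most recent buffered write to the same address. A real switch would additionally need to handle ordering across hosts, recovery after a crash, and timing, none of which this model attempts.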
