The Case for Persistent CXL switches

Khan Shaikhul Hadi, Naveed Ul Mustafa, Mark Heinrich, Yan Solihin

TL;DR

The paper addresses the high latency of persisting updates in CXL-attached persistent memory by introducing a memory-centric Persistent CXL Switch (PCS) that embeds a Persistent Buffer (PB) at the switch. The authors design a system-independent PB with a corresponding Persist Buffer Controller (PBC) and Selector (PBCS) to ensure correctness (read/write order and crash consistency) while overlapping persistence with normal switch operation. A Read Forwarding (RF) optimization further reduces latency by serving reads from PB when possible. Experimental results on Splash-4 workloads show average persist-speedups of 12% (PB) and 15% (PB_RF) over a volatile CXL switch, with substantial reductions in persist latency and selective improvements in read latency, demonstrating practical benefits for crash-consistent data-intensive applications.

Abstract

A Compute Express Link (CXL) switch allows memory extension via the PCIe physical layer to address the increasing demand for larger memory capacities in data centers. However, CXL-attached memory introduces 170 ns to 400 ns of memory latency. This becomes a significant performance bottleneck for applications that host data in persistent memory, because every update, after traversing the CXL switch, must reach the persistent domain to be crash consistent. We make a case for a persistent CXL switch that persists updates as soon as they reach the switch and hence significantly reduces the latency of persisting data. To enable this, we present a system-independent persistent buffer (PB) design that ensures data persistency at the CXL switch. Our PB design provides a 12% speedup, on average, over a volatile CXL switch. Our read forwarding optimization improves the speedup to 15%.
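The idea of persisting at the switch and forwarding reads from the buffer can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' design: the class name `PersistentBufferSketch`, the fixed capacity, the oldest-first drain policy, and the coalescing of repeated writes to one address are all hypothetical choices made for illustration, and a plain dictionary stands in for the CXL-attached persistent memory.

```python
from collections import OrderedDict


class PersistentBufferSketch:
    """Toy model of a persistent buffer (PB) placed in a CXL switch.

    Hypothetical illustration only: capacity, drain policy, and write
    coalescing are assumptions, not details taken from the paper.
    """

    def __init__(self, capacity, backing_memory):
        self.capacity = capacity
        self.entries = OrderedDict()   # address -> newest buffered value
        self.memory = backing_memory   # dict standing in for persistent memory

    def write(self, addr, value):
        """Buffer a write; it counts as persisted once it is in the PB."""
        if addr not in self.entries and len(self.entries) >= self.capacity:
            self.drain_one()           # make room before accepting the write
        self.entries[addr] = value
        self.entries.move_to_end(addr)

    def drain_one(self):
        """Move the oldest buffered write to the backing memory."""
        addr, value = self.entries.popitem(last=False)
        self.memory[addr] = value

    def read(self, addr):
        """Read forwarding: serve from the PB when it holds the address."""
        if addr in self.entries:
            return self.entries[addr], "pb"
        return self.memory.get(addr), "memory"


memory = {}
pb = PersistentBufferSketch(capacity=2, backing_memory=memory)
pb.write(0x10, 1)
pb.write(0x20, 2)
first_read = pb.read(0x10)    # still buffered, so it is forwarded from the PB
pb.write(0x30, 3)             # buffer is full: the oldest entry (0x10) drains
second_read = pb.read(0x10)   # now served from the backing memory
```

In this model a read that hits the buffer never travels to the memory device, which is the latency saving that read forwarding targets; a read that misses falls through to memory as it would with a volatile switch.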

Paper Structure

This paper contains 25 sections, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Normalized latency of persist operations in FFT.
  • Figure 2: Potential time savings for persistent CXL switch (PCS).
  • Figure 3: Persistent Buffer Design for Persistent CXL switch.
  • Figure 4: Workflow of PB for read, write, and acknowledgment packets.
  • Figure 5: Speedup of PB and PB_RF over NoPB (higher is better).
  • ...and 3 more figures