ByteFS: System Support for (CXL-based) Memory-Semantic Solid-State Drives
Shaobo Li, Yirui Eric Zhou, Hao Ren, Jian Huang
TL;DR
ByteFS tackles block-only I/O limitations on memory-semantic SSDs by enabling a dual byte/block data access model and introducing firmware-level log-structured memory with a log-coalescing write path. It combines a skip-list-based log index, transaction-oriented updates, and coordinated host/SSD caching to achieve crash-consistent, low-amplification data management. Empirical evaluation on real hardware and an emulator shows ByteFS delivers up to $2.7\times$ throughput improvements and up to $5.1\times$ reductions in write traffic across diverse workloads, validating its effectiveness and practicality. The work demonstrates a viable path to harness memory-semantic storage for scalable, cost-effective storage systems while preserving core file-system properties.
Abstract
Unlike non-volatile memory that resides on the processor memory bus, memory-semantic solid-state drives (SSDs) support both byte and block access granularity via PCIe or CXL interconnects. They provide scalable memory capacity using NAND flash at a much lower cost. In addition, their byte and block interfaces have different performance characteristics, while offering essential memory semantics to upper-level software. Such a byte-accessible storage device has new implications for software system design. In this paper, we develop a new file system, named ByteFS, by rethinking the design primitives of file systems and SSD firmware to exploit the advantages of both byte- and block-granular data accesses. ByteFS supports byte-granular data persistence to retain the persistent nature of SSDs. It extends the core data structures of file systems by enabling dual byte/block-granular data accesses. To facilitate the support for byte-granular writes, ByteFS manages the internal DRAM of SSD firmware in a log-structured manner and enables data coalescing to reduce unnecessary I/O traffic to flash chips. ByteFS also enables coordinated data caching between the host page cache and the SSD cache to make the best use of precious memory resources. We implement ByteFS on both a real programmable SSD and an emulated memory-semantic SSD for sensitivity studies. Compared to state-of-the-art file systems for non-volatile memory and conventional SSDs, ByteFS outperforms them by up to 2.7$\times$, while preserving the essential properties of a file system. ByteFS also reduces the write traffic to SSDs by up to 5.1$\times$ by alleviating unnecessary writes caused by both metadata and data updates in file systems.
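
To make the idea of log-structured buffering with data coalescing concrete, the following is a minimal sketch in Python. It is a toy model, not the ByteFS firmware: the class name `CoalescingLog`, the 4096-byte `PAGE_SIZE`, and the dictionary standing in for flash are all assumptions made for illustration. It only shows why appending small byte-granular writes to a log and merging them per page before flushing can reduce the number of page writes that reach flash.

```python
PAGE_SIZE = 4096  # assumed flash page size in bytes (illustrative only)


class CoalescingLog:
    """Toy append-only log that coalesces byte-granular writes per page.

    Hypothetical model for illustration; not the ByteFS design itself.
    """

    def __init__(self):
        self.entries = []  # append-only list of (byte_offset, data)

    def write(self, offset, data):
        """Record a byte-granular write without touching flash."""
        self.entries.append((offset, bytes(data)))

    def flush(self, flash):
        """Merge logged writes per page, then write each dirty page once.

        `flash` is a dict mapping page number -> page bytes (a stand-in
        for the flash chips). Returns the number of pages written.
        """
        dirty = {}
        for offset, data in self.entries:  # replay in log order; later wins
            pos, remaining = offset, data
            while remaining:
                page_no, in_page = divmod(pos, PAGE_SIZE)
                chunk = remaining[:PAGE_SIZE - in_page]
                if page_no not in dirty:
                    # Read-modify-write: start from the current page image.
                    old = flash.get(page_no, bytes(PAGE_SIZE))
                    dirty[page_no] = bytearray(old)
                dirty[page_no][in_page:in_page + len(chunk)] = chunk
                pos += len(chunk)
                remaining = remaining[len(chunk):]
        for page_no, page in dirty.items():
            flash[page_no] = bytes(page)
        self.entries.clear()
        return len(dirty)


# Usage: three small writes touch only two pages, so two page writes occur.
flash = {}
log = CoalescingLog()
log.write(0, b"aaaa")
log.write(2, b"bb")          # overlaps the first write; later data wins
log.write(4094, b"xyz")      # spans the boundary between pages 0 and 1
pages = log.flush(flash)
```

In this sketch, three logged writes result in two page writes rather than three. A real device would also need to persist the log, index it for reads, and bound its size, none of which is modeled here.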
