The Design and Implementation of a High-Performance Log-Structured RAID System for ZNS SSDs
Jinhong Li, Yiyang Geng, Qiuping Wang, Shujie Han, Patrick P. C. Lee
TL;DR
ZapRAID tackles the challenge of building a scalable, high-performance RAID over ZNS SSDs by combining Zone Append-driven intra-zone parallelism with a group-based stripe layout and a hybrid data-management strategy that also employs Zone Write for inter-zone parallelism. It extends Log-RAID with a compact, per-group stripe management scheme and introduces L2P offloading to reduce in-memory footprint while ensuring crash consistency through header/footer metadata and resilient stripe reconstruction. The implementation in SPDK demonstrates substantially higher small-write throughput, competitive large-write performance, and strong degraded-read and crash-recovery behavior, along with significantly reduced memory usage for index structures. These results indicate ZapRAID’s practical viability for real-world, mixed workloads on ZNS arrays, offering a pathway to scalable, reliable, and efficient zone-based RAID storage.
Abstract
Zoned Namespace (ZNS) defines a new abstraction for host software to flexibly manage storage in flash-based SSDs as append-only zones. It also provides a Zone Append primitive to further boost the write performance of ZNS SSDs by exploiting intra-zone parallelism. However, making Zone Append effective for reliable and scalable storage, in the form of a RAID array of multiple ZNS SSDs, is non-trivial, since Zone Append offloads address management to ZNS SSDs and requires hosts to specifically manage RAID stripes across multiple drives. We propose ZapRAID, a high-performance log-structured RAID system for ZNS SSDs that carefully exploits Zone Append to achieve high write parallelism and lightweight stripe management. ZapRAID adopts a group-based data layout with a coarse-grained ordering across multiple groups of stripes, such that it can use small-size metadata for stripe management on a per-group basis under Zone Append. It further adopts hybrid data management to simultaneously achieve intra-zone and inter-zone parallelism through a careful combination of both Zone Write and Zone Append primitives. We implement ZapRAID as a user-space block device, and evaluate ZapRAID using microbenchmarks, trace-driven experiments, and real-application experiments. Our evaluation results show that ZapRAID achieves high write throughput and maintains high performance in normal reads, degraded reads, crash recovery, and full-drive recovery.
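To make the stripe-management problem concrete, the following is a minimal in-memory sketch, not ZapRAID's implementation and not an SPDK or NVMe API. It assumes a toy `Zone` class, a single XOR parity chunk per stripe, and a hypothetical per-chunk `stripe_map`; all names, chunk contents, and the arrival order of the two in-flight stripes are illustrative assumptions. It shows that under Zone Append the device chooses each offset, so chunks of one stripe can land at different offsets on different drives, and the host must record those offsets to serve a degraded read.

```python
from functools import reduce


class Zone:
    """Toy append-only zone: a list of blocks with an implicit write pointer."""

    def __init__(self):
        self.blocks = []

    def zone_write(self, offset, block):
        # Zone Write: the host names the offset, which must equal the write pointer.
        if offset != len(self.blocks):
            raise ValueError("write must land exactly at the write pointer")
        self.blocks.append(block)

    def zone_append(self, block):
        # Zone Append: the device picks the offset and reports it on completion.
        self.blocks.append(block)
        return len(self.blocks) - 1


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks (single parity, RAID-5 style)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_demo():
    """Append two in-flight stripes to four drives with differing arrival orders."""
    zones = [Zone() for _ in range(4)]  # one open zone per drive (assumption)
    stripes = {
        "A": [b"aaaa", b"bbbb", b"cccc"],
        "B": [b"dddd", b"eeee", b"ffff"],
    }
    # Hypothetical order in which each drive happens to process the two stripes.
    arrival = [("A", "B"), ("B", "A"), ("A", "B"), ("B", "A")]
    stripe_map = {}  # (stripe id, drive) -> offset returned by the device
    for drive, order in enumerate(arrival):
        for sid in order:
            chunks = stripes[sid] + [xor_blocks(stripes[sid])]  # 3 data + 1 parity
            stripe_map[(sid, drive)] = zones[drive].zone_append(chunks[drive])
    return zones, stripe_map


def degraded_read(zones, stripe_map, sid, lost_drive):
    """Rebuild the chunk on `lost_drive` by XORing the surviving chunks of the stripe."""
    survivors = [
        zones[d].blocks[stripe_map[(sid, d)]]
        for d in range(len(zones))
        if d != lost_drive
    ]
    return xor_blocks(survivors)


zones, stripe_map = build_demo()
offsets_a = [stripe_map[("A", d)] for d in range(4)]
rebuilt = degraded_read(zones, stripe_map, "A", lost_drive=1)
```

Here `offsets_a` is `[0, 1, 0, 1]`: stripe A's chunks are not aligned across drives, so the rebuilt chunk (`b"bbbb"`) can only be recovered because the host kept the returned offsets. This sketch tracks one entry per chunk, which is the kind of bookkeeping cost that the group-based layout described above is meant to reduce; that layout itself is not modeled here.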
