Revisiting Computational Storage for Data Integrity and Security
Chao Shi, Anthony Manschula, Tabassum Mahmud, Zeren Yang, Mai Zheng, Yong Chen, Jim Wayda, Matthew Wolf, Byungwoo Bang
TL;DR
The paper tackles data integrity and security in computational storage by introducing a dedicated library (CSDGuard) that applies FI, EC, and RDR techniques for on-disk protection. It adopts the Computational Storage Architecture Programming Model to formalize host–device interactions, IO interception, and multi-dimensional array operations, and demonstrates a prototype on a Samsung SmartSSD with an on-drive FPGA and P2P data transfer. Empirical results show latency improvements of up to 70% for on-device processing, with hardware kernels delivering roughly 3x speedups at smaller data sizes that diminish to roughly 1.4x at larger scales. The work argues for the viability of CSD-based reliability and security primitives and outlines future directions: extending the FI/EC/RDR use cases and benchmarking against Ceph/HDFS configurations.
Abstract
The idea of the computational storage device (CSD) has come a long way since at least the 1990s [1], [2]. By embedding computing resources within storage devices, CSDs can potentially offload computational tasks from CPUs and enable near-data processing (NDP), significantly reducing data movement and/or energy consumption. While the initial hard-disk-based CSDs suffered from severe limitations in on-drive resources, programmability, etc., the storage market has recently witnessed the commercialization of solid-state-drive (SSD) based CSDs (e.g., Samsung SmartSSD [3], ScaleFlux CSDs [4]), which has enabled CSD-based optimizations for a variety of application scenarios (e.g., [5], [6], [7]).
