Revisiting Computational Storage for Data Integrity and Security

Chao Shi, Anthony Manschula, Tabassum Mahmud, Zeren Yang, Mai Zheng, Yong Chen, Jim Wayda, Matthew Wolf, Byungwoo Bang

TL;DR

The paper tackles data integrity and security in computational storage by introducing a dedicated library (CSDGuard) that applies FI, EC, and RDR techniques for on-drive protection. It adopts the Computational Storage Architecture Programming Model to formalize host–device interactions, I/O interception, and multi-dimensional array operations, and demonstrates a prototype on a Samsung SmartSSD with an on-drive FPGA and P2P data transfer. Empirical results show latency improvements of up to 70% for on-device processing, with hardware kernels delivering roughly 3x speedups at smaller data sizes that diminish to about 1.4x at larger sizes. The work argues for the viability of CSD-based reliability and security primitives and outlines future directions: extending the FI/EC/RDR use cases and benchmarking against Ceph/HDFS configurations.

Abstract

The idea of the computational storage device (CSD) has come a long way since at least the 1990s [1], [2]. By embedding computing resources within storage devices, CSDs can potentially offload computational tasks from CPUs and enable near-data processing (NDP), significantly reducing data movement and/or energy consumption. While the initial hard-disk-based CSDs suffered from severe limitations in on-drive resources, programmability, etc., the storage market has recently witnessed the commercialization of solid-state-drive (SSD) based CSDs (e.g., Samsung SmartSSD [3], ScaleFlux CSDs [4]), which has enabled CSD-based optimizations for a variety of application scenarios (e.g., [5], [6], [7]).

Paper Structure

This paper contains 1 section, 6 figures.

Table of Contents

  1. Introduction

Figures (6)

  • Figure 1: Operation of the Samsung SmartSSD-based approach with P2P.
  • Figure 2: Operation of the system using the CPU-based approach.
  • Figure 3: Kernel processing time for each data size, measured in seconds.
  • Figure 4: Time to complete data transfers of 576 KiB, 4 MiB, and 9.2 MiB, measured in microseconds.
  • Figure 5: End-to-End program execution time, including kernel and data transfer times, measured in seconds.
  • ...and 1 more figure