Flexible In-NAND Cryptographic Processing for Secure Flash Storage

Seock-Hwan Noh, Hoyeon Lee, Junkyum Kim, Junsu Im, Jay H. Park, Sungjin Lee, Sam H. Noh, Yeseong Kim, Jaeha Kung

TL;DR

This work addresses the security and performance limitations of conventional SSD encryption by moving cryptographic processing into the NAND die. FlashVault is a reconfigurable in-NAND cryptographic engine that handles block ciphers, public-key cryptography (PKC), and post-quantum cryptography (PQC) within the unused space of 4D V-NAND, eliminating off-chip data exposure and host-side vulnerabilities. The authors detail a hardware architecture with BCE and ACE modules, LDPC-based on-die ECC, and a PUF-based key management path, plus runtime algorithm reconfiguration. Results from post-layout simulations show latency and throughput benefits over CPU-based encryption and near-core processing (1.46x to 3.45x and 1.02x to 2.01x, respectively, per the abstract), while meeting boot-time verification constraints and staying within a feasible die area and power budget. The approach targets practical, standards-compliant secure SSD deployment with broad cryptographic support and quantum-resilient capabilities.

Abstract

We present FlashVault, an in-NAND self-encryption architecture that embeds a reconfigurable cryptographic engine into the unused silicon area of a state-of-the-art 4D V-NAND structure. FlashVault supports not only block ciphers for data encryption but also public-key and post-quantum algorithms for digital signatures, all within the NAND flash chip. This design enables each NAND chip to operate as a self-contained enclave without incurring area overhead, while eliminating the need for off-chip encryption. We implement FlashVault at the register-transfer level (RTL) and perform place-and-route (P&R) for accurate power/area evaluation. Our analysis shows that the power budget determines the number of cryptographic engines per NAND chip. We integrate this architectural choice into a full-system simulation and evaluate its performance on a wide range of cryptographic algorithms. Our results show that FlashVault consistently outperforms both CPU-based encryption (by 1.46x to 3.45x) and near-core processing architectures (by 1.02x to 2.01x), demonstrating its effectiveness as a secure SSD architecture that meets diverse cryptographic requirements imposed by regulatory standards and enterprise policies.

Paper Structure

This paper contains 29 sections, 2 equations, 14 figures, 4 tables.

Figures (14)

  • Figure 1: Overview of FlashVault, which embeds a reconfigurable cryptographic engine beneath the 4D V-NAND to perform data encryption and signing directly within NAND. This in-NAND design enables low-latency encryption and prevents plaintext exposure by avoiding off-chip data movement.
  • Figure 2: Device structures of (a) a floating gate transistor and (b) a charge trap flash. Structures of (c) 2D, (d) 3D and (e) 4D NAND flash memories.
  • Figure 3: (a) Area breakdown of a 512 Gb 3D V-NAND chip into the cell array and peripheral circuits for three major vendors. Peripheral circuits include components such as row decoders, page buffers, and charge pumps. (b) Comparison of die sizes between 128-layer 3D NAND and 176-layer 4D NAND flash from the same vendors. Sources: piftech_insight_survey, hynix_die_area, Hynix_4d_nand, 3d_4d_comparison_nand.
  • Figure 4: Internal architecture of modern SSDs.
  • Figure 5: Normalized latency breakdown of decryption (Dec.) and encryption (Enc.) using six representative block cipher algorithms on CPU and FlashVault (FV). Each bar shows I/O latency (green) and cipher latency (gray); red circles indicate total latency normalized to the CPU implementation.
  • ...and 9 more figures