STAR: Improving Lifetime and Performance of High-Capacity Modern SSDs Using State-Aware Randomizer

Omin Kwon, Kyungjun Oh, Jaeyong Lee, Myungsuk Kim, Jihong Kim

TL;DR

This work addresses the reliability challenges caused by lateral charge spreading (LCS) in high-density 3D NAND flash with STAR, a state-aware randomizer that reshapes $V_{th}$ state distributions through group-level bit flips rather than relying on uniform randomization. STAR operates in three stages: LFSR randomization, group error estimation, and optimal bit-flip. It stores a Flip Indicator Bit ($\text{FIB}$) so that the flip can be reversed on reads, and it uses hardware optimizations (Zig-Zag IO Scheduling, Group-Level Pipelining, Parallel Error Estimator) to maintain I/O throughput. Evaluations on 160 real 3D TLC/QLC chips and a STAR-enabled SSD emulator show up to 2.3x lifetime improvement and read-latency reductions of about $46\%$ (TLC) and $50\%$ (QLC), outperforming TailCut with modest overhead and no NAND chip modifications. The approach is particularly effective for high-density QLC and scalable to PLC, offering a practical, controller-level way to improve SSD longevity and performance in next-generation flash memory. The group error $E_G = \sum_{i=1}^{G} e_{s_i} = \sum_{k=0}^{15} e_k N_k$ and the bit-flip operation $f(b_{\text{LSB}}, b_{\text{CSB}}, b_{\text{MSB}}, b_{\text{TSB}})$ formalize the error estimate and state transformation underpinning STAR.
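The three stages can be illustrated with a small Python sketch for QLC cells. This is not the authors' implementation: the LFSR polynomial, the per-state weights `E_WEIGHTS`, the use of the raw 4-bit pattern as the state index $k$, the 4-bit flip mask recorded per group, and all function names are assumptions chosen only to make the example runnable. A real controller would use the chip's Gray-coded state mapping and error weights obtained from characterization.

```python
from collections import Counter

PAGES = 4  # QLC pages: LSB, CSB, MSB, TSB (bit p of a cell value = page p)

# Hypothetical per-state error weights e_k for k = 0..15 (not from the paper).
E_WEIGHTS = [8, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 3, 5, 8]


def lfsr_bits(seed, n):
    """Yield n bits from a 16-bit Fibonacci LFSR (assumed taps 16, 14, 13, 11)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("LFSR seed must be non-zero")
    for _ in range(n):
        out = state & 1
        fb = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (fb << 15)
        yield out


def randomize(cells, seed):
    """Stage 1: XOR each 4-bit cell with LFSR output (self-inverse for a given seed)."""
    stream = lfsr_bits(seed, PAGES * len(cells))
    out = []
    for c in cells:
        mask = 0
        for p in range(PAGES):
            mask |= next(stream) << p
        out.append(c ^ mask)
    return out


def group_error(cells, weights=E_WEIGHTS):
    """Stage 2: E_G = sum over k of e_k * N_k, with N_k the count of cells in state k."""
    counts = Counter(cells)
    return sum(weights[k] * n for k, n in counts.items())


def best_flip(cells, weights=E_WEIGHTS):
    """Stage 3: try every page-flip mask and keep the one with the lowest E_G."""
    mask = min(range(1 << PAGES),
               key=lambda m: group_error([c ^ m for c in cells], weights))
    return mask, [c ^ mask for c in cells]


def star_write(cells, seed, group_size):
    """Randomize, then flip each group; return stored cells and per-group flip masks."""
    rnd = randomize(cells, seed)
    stored, flip_masks = [], []
    for i in range(0, len(rnd), group_size):
        mask, flipped = best_flip(rnd[i:i + group_size])
        stored.extend(flipped)
        flip_masks.append(mask)
    return stored, flip_masks


def star_read(stored, flip_masks, seed, group_size):
    """Undo the per-group flips using the recorded masks, then de-randomize."""
    rnd = []
    for g, mask in enumerate(flip_masks):
        rnd.extend(c ^ mask for c in stored[g * group_size:(g + 1) * group_size])
    return randomize(rnd, seed)
```

Because the all-zero mask is always among the candidates, each group's estimated error after the flip is never higher than after plain randomization, and the recorded mask is all that a read needs to restore the original data.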

Abstract

Although NAND flash memory has achieved continuous capacity improvements via advanced 3D stacking and multi-level cell technologies, these innovations introduce new reliability challenges, particularly lateral charge spreading (LCS), absent in low-capacity 2D flash memory. Since LCS significantly increases retention errors over time, addressing this problem is essential to ensure the lifetime of modern SSDs employing high-capacity 3D flash memory. In this paper, we propose a novel data randomizer, STate-Aware Randomizer (STAR), which proactively eliminates the majority of weak data patterns responsible for retention errors caused by LCS. Unlike existing techniques that target only specific worst-case patterns, STAR effectively removes a broad spectrum of weak patterns, significantly enhancing reliability against LCS. By employing several optimization schemes, STAR can be efficiently integrated into the existing I/O datapath of an SSD controller with negligible timing overhead. To evaluate the proposed STAR scheme, we developed a STAR-aware SSD emulator based on characterization results from 160 real 3D NAND flash chips. Experimental results demonstrate that STAR improves SSD lifetime by up to 2.3x and reduces read latency by an average of 50% on real-world traces compared to conventional SSDs.

Paper Structure

This paper contains 20 sections, 4 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: (a) Organization of 3D flash memory. (b) Comparison of $V_{th}$ distributions between TLC flash memory (top) and QLC flash memory (bottom).
  • Figure 2: (a) Mechanism of LCS. (b) Comparison of TLC data patterns before and after randomization.
  • Figure 3: Comparison of inter-state error distributions between TLC flash memory (top) and QLC flash memory (bottom).
  • Figure 4: Top 10 LCS-induced weak patterns of TLC and QLC flash memory.
  • Figure 5: Illustrative example of STAR.
  • ...and 5 more figures