Machine Learning-Driven Intelligent Memory System Design: From On-Chip Caches to Storage

Rahul Bera, Rakesh Nadig, Onur Mutlu

Abstract

Despite the data-rich environment in which memory systems of modern computing platforms operate, many state-of-the-art architectural policies employed in the memory system rely on static, human-designed heuristics that fail to truly adapt to the workload and system behavior via principled learning methodologies. In this article, we propose a fundamentally different design approach: using lightweight and practical machine learning (ML) methods to enable adaptive, data-driven control throughout the memory hierarchy. We present three ML-guided architectural policies: (1) Pythia, a reinforcement learning-based data prefetcher for on-chip caches, (2) Hermes, a perceptron learning-based off-chip predictor for multi-level cache hierarchies, and (3) Sibyl, a reinforcement learning-based data placement policy for hybrid storage systems. Our evaluation shows that Pythia, Hermes, and Sibyl significantly outperform the best-prior human-designed policies, while incurring modest hardware overheads. Collectively, this article demonstrates that integrating adaptive learning into memory subsystems can lead to intelligent, self-optimizing architectures that unlock performance and efficiency gains beyond what is possible with traditional human-designed approaches.

Paper Structure (32 sections, 4 equations, 9 figures, 1 table)

Figures (9)

  • Figure 1: (a) Coverage and overprediction, and (b) performance comparison of two recently proposed prefetchers: SPP and Bingo. (c) Percentage of loads that miss the LLC and go off-chip (on the left y-axis) and the LLC MPKI (on the right y-axis) in the baseline system with a state-of-the-art prefetcher. (d) Accuracy of two off-chip predictors, HMP and TTP. Average request latency of CDE and RNN-HSS on (e) performance-oriented and (f) cost-oriented HSS. The average request latency is normalized to the Fast-Only policy. Figures adapted from our MICRO 2021 (Pythia), MICRO 2022 (Hermes), and ISCA 2022 (Sibyl) papers.
  • Figure 2: Overview of a reinforcement learning system.
  • Figure 3: Overview of a single-layer perceptron model.
  • Figure 4: (a) Formulating prefetcher as an RL agent. (b) Overview of Pythia. Figures adapted from our MICRO 2021 paper (Pythia).
  • Figure 5: Average performance improvement of prior prefetchers and Pythia in systems with varying (a) number of cores and (b) DRAM million transfers per second (MTPS). Figures adapted from our MICRO 2021 paper (Pythia).
  • ...and 4 more figures