From Good to Great: Improving Memory Tiering Performance Through Parameter Tuning

Konstantinos Kanellis, Sujay Yadalam, Fanchao Chen, Michael Swift, Shivaram Venkataraman

TL;DR

The paper tackles suboptimal memory tiering performance caused by heuristics and static thresholds by tuning knobs with Bayesian Optimization to reflect application behavior and hardware. It applies the tuning pipeline to two tiering engines, HeMem and HMSDK, and demonstrates substantial performance gains. Across diverse workloads and system configurations, the tuned configurations achieve up to about twofold improvements over the default configurations and up to about 1.56x over Memtis, illustrating the value of workload-aware knob tuning. The work provides insights into workload-specific migration patterns and offers guidance for designing more adaptive memory tiering systems.

Abstract

Memory tiering systems achieve memory scaling by adding multiple tiers of memory, where different tiers have different access latencies and bandwidths. For maximum performance, frequently accessed (hot) data must be placed close to the host in faster tiers, while infrequently accessed (cold) data can be placed in farther, slower memory tiers. Existing tiering solutions employ heuristics and pre-configured thresholds to make data placement and migration decisions. Unfortunately, these systems fail to adapt to different workloads and the underlying hardware, and so perform sub-optimally. In this paper, we improve the performance of memory tiering by using knowledge of application behavior to set various parameters (knobs) in existing tiering systems. To do so, we leverage Bayesian Optimization to discover well-performing configurations that capture the application behavior and the underlying hardware characteristics. We find that Bayesian Optimization is able to learn workload behaviors and set parameter values that result in good performance. We evaluate this approach with two existing tiering systems, HeMem and HMSDK. Our evaluation reveals that configuring the parameter values correctly can improve performance by 2x over the same systems with default configurations and by 1.56x over a state-of-the-art tiering system.
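
To make the tuning loop described in the abstract concrete, the following is a minimal sketch of Bayesian Optimization over tiering knobs, written in plain Python. It is not the authors' pipeline. The knob names (`hot_threshold`, `migration_period_ms`), their ranges, the default values, and the synthetic `run_workload` cost function are all assumptions made for illustration; in a real setup `run_workload` would launch the application under the tiering system and return its measured execution time. The surrogate is a small Gaussian process with an RBF kernel, and the next configuration is chosen by expected improvement over random candidates.

```python
import math
import random

# Hypothetical knobs and ranges, for illustration only (not real HeMem/HMSDK parameters).
KNOBS = {"hot_threshold": (1.0, 32.0), "migration_period_ms": (10.0, 1000.0)}
DEFAULT = {"hot_threshold": 2.0, "migration_period_ms": 1000.0}

def run_workload(cfg):
    """Stand-in for timing a real run: a synthetic bowl with its minimum at (8, 200)."""
    h, m = cfg["hot_threshold"], cfg["migration_period_ms"]
    return 100.0 + 0.5 * (h - 8.0) ** 2 + 1e-4 * (m - 200.0) ** 2

def to_unit(cfg):
    # Scale every knob to [0, 1] so one kernel length scale fits all dimensions.
    return [(cfg[k] - lo) / (hi - lo) for k, (lo, hi) in KNOBS.items()]

def from_unit(x):
    return {k: lo + xi * (hi - lo) for xi, (k, (lo, hi)) in zip(x, KNOBS.items())}

def rbf(a, b, length=0.3):
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2 * length ** 2))

def cholesky(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def forward_sub(L, b):
    # Solve L y = b for lower-triangular L.
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward_sub(L, y):
    # Solve L^T x = y.
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(X, y, jitter=1e-4):
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    return L, backward_sub(L, forward_sub(L, y))

def predict(X, L, alpha, x):
    # Posterior mean and standard deviation of the surrogate at point x.
    k = [rbf(a, x) for a in X]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = forward_sub(L, k)
    return mean, math.sqrt(max(1.0 - sum(vi * vi for vi in v), 1e-12))

def expected_improvement(mean, std, best):
    # Expected improvement for minimization.
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (best - mean) * cdf + std * pdf

def tune(n_random=4, n_iter=15, n_candidates=300, seed=0):
    rng = random.Random(seed)
    dim = len(KNOBS)
    # Start from the default configuration plus a few random ones.
    X = [to_unit(DEFAULT)] + [[rng.random() for _ in range(dim)] for _ in range(n_random)]
    times = [run_workload(from_unit(x)) for x in X]
    for _ in range(n_iter):
        mu = sum(times) / len(times)
        sd = math.sqrt(sum((t - mu) ** 2 for t in times) / len(times)) or 1.0
        y = [(t - mu) / sd for t in times]  # standardize observed times
        L, alpha = fit_gp(X, y)
        best_y = min(y)
        cands = [[rng.random() for _ in range(dim)] for _ in range(n_candidates)]
        nxt = max(cands, key=lambda c: expected_improvement(*predict(X, L, alpha, c), best_y))
        X.append(nxt)
        times.append(run_workload(from_unit(nxt)))
    i = min(range(len(times)), key=times.__getitem__)
    return from_unit(X[i]), times[i]
```

Calling `tune()` returns the best configuration seen and its (synthetic) execution time. Because the search is seeded with the default configuration, the returned time is never worse than the default's. In practice each evaluation is a full workload run, so the number of iterations is the main cost to budget.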

Paper Structure

This paper contains 22 sections, 1 equation, 13 figures, 5 tables.

Figures (13)

  • Figure 1: Execution time (in seconds) of the GUPS (left) and Silo (right) workloads when we tweak two HeMem parameters. The default configuration's execution time is shown in a red box.
  • Figure 2: Performance improvement of best HeMem configuration found over default for all workloads.
  • Figure 3: GapBS-BC: Number of migrations over time.
  • Figure 4: GapBS-PR: Memory access pattern over time.
  • Figure 5: XSBench: Memory access pattern over time.
  • ...and 8 more figures