Getting the MOST out of your Storage Hierarchy with Mirror-Optimized Storage Tiering

Kaiwei Tu, Kan Wu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau

TL;DR

MOST addresses the challenge of heterogeneous storage hierarchies by blending mirroring with tiering to achieve high bandwidth without frequent data migrations. It introduces a two-class data layout (mirrored and tiered) and a dynamic control loop (Optimizer, OffloadRatio, Migrator) to route reads/writes and migrate data with minimal movement. Implemented as Cerberus on CacheLib, MOST demonstrates substantial throughput gains and reduced writes across static and dynamic workloads, plus strong performance on production traces and YCSB-style benchmarks. The approach offers a practical path to exploiting modern storage devices’ bandwidth while mitigating tail latency and wear, with clear directions for extending to more tiers and QoS mechanisms.

Abstract

We present Mirror-Optimized Storage Tiering (MOST), a novel tiering-based approach optimized for modern storage hierarchies. The key idea of MOST is to combine the load balancing advantages of mirroring with the space-efficiency advantages of tiering. Specifically, MOST dynamically mirrors a small amount of hot data across storage tiers to efficiently balance load, avoiding costly migrations. As a result, MOST is as space-efficient as classic tiering while achieving better bandwidth utilization under I/O-intensive workloads. We implement MOST in Cerberus, a user-level storage management layer based on CacheLib. We show the efficacy of Cerberus through a comprehensive empirical study: across a range of static and dynamic workloads, Cerberus achieves better throughput than competing approaches on modern storage hierarchies, especially under I/O-intensive and dynamic workloads.
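
To make the idea of balancing load through a small mirrored set more concrete, here is a minimal Python sketch of read routing across two tiers. It is an illustration under stated assumptions, not the Cerberus implementation: the `Router` class, the tier names, the deterministic credit scheme, and the latency-based `adjust` rule are all hypothetical, and this section does not describe the real Optimizer's policy. The sketch only shows that reads to mirrored data can be shifted between devices by changing a ratio, with no data movement, while tiered data is always served from its single location.

```python
from dataclasses import dataclass, field

PERF, CAP = "performance", "capacity"   # hypothetical tier names
MIRRORED = "mirrored"                    # data with a copy on both tiers


@dataclass
class Router:
    """Toy read router for a two-tier hierarchy (illustrative only)."""
    offload_ratio: float = 0.0                     # share of mirrored reads sent to CAP
    location: dict = field(default_factory=dict)   # key -> PERF, CAP, or MIRRORED
    credit: float = field(default=0.0, init=False)

    def route_read(self, key):
        where = self.location[key]
        if where != MIRRORED:
            return where          # tiered data has exactly one copy
        # Mirrored data: spread reads deterministically according to the ratio.
        self.credit += self.offload_ratio
        if self.credit >= 1.0:
            self.credit -= 1.0
            return CAP
        return PERF

    def adjust(self, perf_latency_us, cap_latency_us, step=0.125):
        """Assumed feedback rule: offload more when the performance tier is slower."""
        if perf_latency_us > cap_latency_us:
            self.offload_ratio = min(1.0, self.offload_ratio + step)
        elif perf_latency_us < cap_latency_us:
            self.offload_ratio = max(0.0, self.offload_ratio - step)


router = Router(offload_ratio=0.25, location={"hot": MIRRORED, "cold": CAP})
targets = [router.route_read("hot") for _ in range(8)]
print(targets.count(CAP), "of 8 mirrored reads went to the capacity tier")
```

Changing `offload_ratio` takes effect on the next read, which is the property the abstract contrasts with migration-based rebalancing.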

Paper Structure

This paper contains 27 sections, 21 figures, 5 tables, 1 algorithm.

Figures (21)

  • Figure 1: MOST Data Layout. Data are logically grouped but not physically placed together on the device.
  • Figure 2: MOST Architecture.
  • Figure 3: CacheLib Architecture. This figure shows CacheLib’s architecture and the lookup workflow. A lookup first checks the DRAM cache and immediately returns the object on a hit. On a miss, it checks the flash cache, issuing a read to the underlying storage devices; if the object is found, it returns the result. A flash cache hit promotes the item to DRAM and may evict an existing DRAM entry. A miss in the flash cache leads to a back-end access.
  • Figure 4: Random Read-only
  • Figure 5: Random Write-only
  • ...and 16 more figures
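
The lookup workflow described in the Figure 3 caption can be sketched in a few lines of Python. This is a toy model, not CacheLib's API: the `TwoLevelCache` class, its LRU policy, and the `backend` callable are hypothetical. Two behaviors go beyond what the caption states and are assumptions: an entry evicted from DRAM is spilled to flash, and an object fetched from the back end is inserted into DRAM.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Toy DRAM + flash cache following the lookup order in the caption."""

    def __init__(self, dram_capacity, backend):
        self.dram = OrderedDict()          # LRU order, oldest entry first
        self.dram_capacity = dram_capacity
        self.flash = {}                    # stands in for the flash cache
        self.backend = backend             # callable: key -> value

    def _insert_dram(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        if len(self.dram) > self.dram_capacity:
            old_key, old_value = self.dram.popitem(last=False)
            self.flash[old_key] = old_value   # assumption: evictions spill to flash

    def get(self, key):
        # DRAM hit: return immediately.
        if key in self.dram:
            self.dram.move_to_end(key)
            return self.dram[key], "dram"
        # DRAM miss: check flash; a hit promotes the item and may evict from DRAM.
        if key in self.flash:
            value = self.flash[key]
            self._insert_dram(key, value)
            return value, "flash"
        # Flash miss: fall through to the back end.
        value = self.backend(key)
        self._insert_dram(key, value)         # assumption: fill DRAM on back-end fetch
        return value, "backend"


cache = TwoLevelCache(dram_capacity=1, backend=lambda k: k.upper())
for key in ["a", "a", "b", "a"]:
    print(key, cache.get(key))
```

With a DRAM capacity of one entry, the sequence above exercises all three paths: a back-end fetch, a DRAM hit, an eviction to flash, and a flash hit with promotion.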