2DIO: A Cache-Accurate Storage Microbenchmark

Yirong Wang, Isaac Khor, Peter Desnoyers

Abstract

We introduce 2DIO, a microbenchmark that generates cache-accurate, stressful I/O traces. Whereas existing tools can only generate traces with well-behaved, concave hit ratio curves, 2DIO produces traces with tunable, complex cache behaviors, particularly performance cliffs and plateaus. Our framework encodes a workload as a compact parameter triplet that captures both short-term recency and long-term frequency. This parsimonious parameterization lets researchers translate individual parameter adjustments into predictable cache effects across various eviction policies. It also allows the parameter space to be "swept" for exhaustive exploration of desired cache behaviors, or calibrated to mimic real traces by matching their observed behaviors. The tuned parameters are portable: if the scale of the system under evaluation changes, the footprint and length of the trace change with it, while the relative cache behaviors are preserved. Evaluations demonstrate 2DIO's ability to generate traces across a continuum of "what-if" cache behaviors and to reproduce real-world ones with high accuracy.

Paper Structure (27 sections, 10 equations, 11 figures, 6 tables, 2 algorithms)

Figures (11)

  • Figure 1: Several CloudPhysics traces showing diverse hit rate behavior. Cache size is normalized to the trace footprint, i.e. the total number of unique blocks accessed in the trace.
  • Figure 3: LRU HRCs for CloudPhysics w44: original trace (blue), 2DIO-generated trace (orange), and trace reconstructed using empirically-measured item frequency distribution (green). Cache size is measured in number of blocks.
  • Figure 4: LRU HRCs and IRD histograms of real traces from AliCloud and CloudPhysics.
  • Figure 5: IRD + IRM trace generation uses Algorithm \ref{alg:gen_from_both}: references are drawn with probability $P_{IRM}$ from an IRM process, and $1-P_{IRM}$ from an IRD renewal process.
  • Figure 6: Correspondences between HRC plateaus & cliffs and IRD holes & spikes. Example synthetic traces Trace A and Trace B: LRU HRC (left), and corresponding IRD distributions (right).
  • ...and 6 more figures
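
The mixing step described in the Figure 5 caption can be illustrated with a minimal Python sketch. This is not the paper's algorithm, whose details are not reproduced here. It only shows the general idea under stated assumptions: with probability p_irm (standing in for $P_{IRM}$) a reference is drawn independently from a fixed popularity distribution, and otherwise the item that is due soonest in a renewal schedule is emitted and rescheduled by a freshly sampled inter-reference gap. The names generate_trace, lru_hit_ratio, irm_weights, and ird_sampler are hypothetical, as are the Zipf-like weights and the uniform gap distribution in the usage lines. The last lines evaluate an LRU hit ratio at cache sizes normalized to the trace footprint, as in the Figure 1 caption.

```python
import heapq
import random
from collections import OrderedDict


def generate_trace(n_items, length, p_irm, irm_weights, ird_sampler, seed=0):
    """Mix IRM draws and IRD-renewal draws (illustrative sketch only)."""
    rng = random.Random(seed)
    items = list(range(n_items))
    # Renewal side: each item carries a scheduled next-reference time.
    schedule = [(ird_sampler(rng), item) for item in items]
    heapq.heapify(schedule)
    trace = []
    for _ in range(length):
        if rng.random() < p_irm:
            # IRM draw: chosen by fixed popularity, independent of history.
            item = rng.choices(items, weights=irm_weights, k=1)[0]
        else:
            # Renewal draw: emit the item due soonest, then reschedule it
            # by a freshly sampled inter-reference gap.
            due, item = heapq.heappop(schedule)
            heapq.heappush(schedule, (due + ird_sampler(rng), item))
        trace.append(item)
    return trace


def lru_hit_ratio(trace, cache_size):
    """Simulate an LRU cache of cache_size blocks and return its hit ratio."""
    cache = OrderedDict()
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = None
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(trace)


# Assumed inputs: Zipf-like popularity and uniform integer gaps.
weights = [1.0 / (rank + 1) for rank in range(100)]
trace = generate_trace(100, 10_000, 0.5, weights, lambda rng: rng.randint(1, 50))
footprint = len(set(trace))  # number of unique blocks in the trace
for fraction in (0.1, 0.5, 1.0):
    size = max(1, int(fraction * footprint))
    print(fraction, lru_hit_ratio(trace, size))
```

Setting p_irm to 1.0 yields a purely frequency-driven trace and 0.0 a purely renewal-driven one. Intermediate values blend the two, which is the behavior the caption describes.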