Foreactor: Exploiting Storage I/O Parallelism with Explicit Speculation

Guanzhou Hu, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau

TL;DR

The paper addresses underutilization of storage I/O parallelism in I/O-heavy applications by introducing explicit speculation, a deterministic approach guided by foreaction graphs that describe I/O patterns and argument computations. It presents Foreactor, a library that intercepts I/O calls and uses a pre-issuing algorithm with either io_uring or a user-thread backend to parallelize I/O with minimal source-code changes. The foreaction-graph abstraction provides formal guarantees and supports case studies across regular I/O loops, B+-tree operations, and LSM-tree searches, achieving significant speedups (up to 50% in du, 37% in B+-tree, and 34% in LevelDB Get). The work demonstrates that explicit speculation can exploit SSD parallelism without checkpointing or heavy prediction, offering practical benefits for a wide range of I/O-intensive applications.

Abstract

We introduce explicit speculation, a variant of I/O speculation in which I/O system calls are parallelized under the guidance of explicit knowledge of the application code. We propose a formal abstraction -- the foreaction graph -- which describes the exact pattern of I/O system calls in an application function, together with any computation needed to produce their argument values. I/O system calls can be issued ahead of time if the graph says it is safe and beneficial to do so. With explicit speculation, serial applications can exploit storage I/O parallelism without involving expensive prediction or checkpointing mechanisms. Based on explicit speculation, we implement Foreactor, a library framework that allows application developers to concretize foreaction graphs and enable concurrent I/O with little or no modification to application source code. Experimental results show that Foreactor improves the performance of both synthetic benchmarks and real applications by 29%-50%.
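
The core idea of issuing I/O ahead of time can be illustrated with a du-style loop, where every file path is known before the loop starts, so each stat call is read-only and its argument does not depend on earlier results. The sketch below is only an illustration under those assumptions: it is not Foreactor's API (Foreactor intercepts system calls and drives them from a foreaction graph), and the names `du_serial`, `du_preissued`, and `depth` are hypothetical. It uses a Python thread pool as a stand-in for a user-thread backend.

```python
import os
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def du_serial(paths):
    """Baseline: one blocking stat per file, issued strictly in program order."""
    return sum(os.stat(p).st_size for p in paths)


def du_preissued(paths, depth=8):
    """Same result as du_serial, but keeps up to `depth` stat calls in flight.

    Pre-issuing is safe here because the calls are read-only and their
    arguments (the paths) are all known in advance.
    """
    total = 0
    remaining = iter(paths)
    with ThreadPoolExecutor(max_workers=depth) as pool:
        # Fill the window with the first `depth` requests.
        window = deque(pool.submit(os.stat, p) for p in islice(remaining, depth))
        while window:
            # Consume results in the original program order.
            total += window.popleft().result().st_size
            # Refill the window with the next request, if any.
            nxt = next(remaining, None)
            if nxt is not None:
                window.append(pool.submit(os.stat, nxt))
    return total
```

Results are consumed in the original order, so the function behaves like the serial loop from the caller's point of view; only the timing of the underlying calls changes. Whether this is faster in practice depends on the storage device and on how much concurrency it can absorb.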

Paper Structure (26 sections, 9 figures, 1 table, 1 algorithm)

Figures (9)

  • Figure 1: Throughput vs. I/O Concurrency on single NVMe SSD. Dashed blue line shows sequential access steady throughput; solid lines show random mixed read-write throughput with different request sizes.
  • Figure 2: Demonstration of I/O Speculation Techniques. Gray boxes represent computation. Blue boxes represent I/O requests. Dashed border means speculative content. Lightning symbol represents the return point of application function. FG stands for foreground application thread. BG stands for background I/O threads.
  • Figure 4: Representative Application Foreaction Graphs. The du, cp, and LevelDB panels show the foreaction graphs for three common application workloads. The LevelDB timeline panel illustrates a possible execution timeline of the LevelDB graph with explicit speculation.
  • Figure 5: Overview of Foreactor Architecture.
  • Figure 6: Throughput of du and cp Workloads.
  • ...and 4 more figures