Valet: Efficient Data Placement on Modern SSDs

Devashish R. Purandare, Peter Alvaro, Avani Wildani, Darrell D. E. Long, Ethan L. Miller

TL;DR

Valet lets log-structured workloads exploit modern SSD interfaces through a userspace shim layer that intercepts application I/O, generates dynamic placement hints, and remaps data to devices, with no changes to applications, filesystems, or the kernel. Its architecture has three parts (application call interception, a pluggable data placement engine, and the valet-mapper device manager) and supports both hint-based and host-managed interfaces, including ZNS and kernel hints. Across RocksDB, MongoDB, and CacheLib, Valet delivers up to 2-4x the write throughput of filesystems and up to 6x lower tail latency, while improving multi-tenant isolation and reducing write amplification and storage-system complexity. Placement hints can be heuristic or learning-based, which keeps the approach extensible and offers a path to wider adoption of advanced SSD interfaces without kernel or application rewrites.

Abstract

The increasing demand for SSDs, coupled with scaling difficulties, has left manufacturers scrambling for newer SSD interfaces that promise better performance and durability. While these interfaces reduce the rigidity of traditional abstractions, they require application or system-level changes that can impact the stability, security, and portability of systems. To make matters worse, such changes are rendered futile with the introduction of next-generation interfaces. It is therefore no surprise that such interfaces have seen limited adoption, leaving behind a graveyard of experimental interfaces ranging from open-channel SSDs to stream SSDs. Our solution, Valet, leverages userspace shim layers to add placement hints for application data, delivering up to 2-4x write throughput over filesystems and comparable or better performance than application-specific solutions, with up to 6x lower tail latency. Valet generates dynamic placement hints, remapping application data to modern SSDs with zero modifications to the application, the filesystem, or the kernel. We demonstrate performance, efficiency, and multi-tenancy benefits of Valet across a set of widely used applications: RocksDB, MongoDB, and CacheLib, presenting a solution that combines the performance of application-specific solutions with wide applicability to log-structured data-intensive applications.
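
As a rough illustration of the flow the abstract describes (intercept a write, attach a placement hint, map the data to the device), the following toy sketch may help. It is not Valet's implementation: the names `Hint`, `placement_hint`, `Mapper`, and `Shim`, the file-extension heuristic, and the in-memory "regions" are all assumptions made for this example. The only idea borrowed from the paper is that a hint carries an affinity (stream) and a lifetime (group).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Hint:
    """Hypothetical placement hint: affinity (stream) plus lifetime (group)."""
    stream: str
    group: str


def placement_hint(path: str) -> Hint:
    """Assumed heuristic for illustration: derive a hint from the file name."""
    name = path.rsplit("/", 1)[-1]
    if name.endswith(".log"):
        return Hint(stream="wal", group="short")
    if name.endswith(".sst"):
        return Hint(stream="sst", group="long")
    return Hint(stream="misc", group="unknown")


@dataclass
class Mapper:
    """Toy device manager: one append-only in-memory region per distinct hint."""
    regions: Dict[Hint, List[bytes]] = field(default_factory=dict)

    def append(self, hint: Hint, data: bytes) -> Tuple[Hint, int]:
        region = self.regions.setdefault(hint, [])
        region.append(data)
        # Return where the data landed: (region key, index within the region).
        return hint, len(region) - 1


class Shim:
    """Stands in for call interception: the application only calls write()."""

    def __init__(self, mapper: Mapper) -> None:
        self.mapper = mapper

    def write(self, path: str, data: bytes) -> Tuple[Hint, int]:
        return self.mapper.append(placement_hint(path), data)


shim = Shim(Mapper())
wal_loc = shim.write("/db/000012.log", b"record-1")
sst_loc = shim.write("/db/000013.sst", b"table-1")
```

In this sketch, writes that share a hint land in the same region, so data expected to expire together stays together. A real shim would sit below an unmodified application and target an actual device interface rather than Python lists.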

Paper Structure

This paper contains 28 sections, 11 figures, and 4 tables.

Figures (11)

  • Figure 1: Writes to zonefs can get the full bandwidth while f2fs sees degradation in both latency and throughput.
  • Figure 2: CPU time breakdown shows zonefs yields 72% of CPU time back while f2fs only yields 49%, with 34% spent on synchronization overhead.
  • Figure 3: Simplified Valet architecture: Valet intercepts application calls, generates placement plans, resolves them to a particular protocol and finally manages data placement.
  • Figure 4: Valet hints are organized based on affinity (streams) and lifetime (groups).
  • Figure 5: Throughput for db_bench workloads.
  • ...and 6 more figures