SwitchDelta: Asynchronous Metadata Updating for Distributed Storage with In-Network Data Visibility

Junru Li; Qing Wang; Zhe Yang; Shuo Liu; Jiwu Shu; Youyou Lu

TL;DR

Distributed storage systems that separate data from metadata must perform ordered writes to ensure linearizability, but this ordering incurs latency and throughput overheads. SwitchDelta introduces an in-switch data visibility layer that buffers in-flight metadata updates, makes data visible immediately after the data write phase, and applies the updates to metadata nodes asynchronously. The design combines timestamp-based concurrency control, hash-based in-switch indexing, partial-write support, and batching, and it includes failure handling. It is applied to three systems: a log-structured key-value store, a distributed file system, and a distributed secondary index. Evaluations show median write latency reductions of up to 52.4% and throughput improvements of up to 126.9% under write-heavy workloads. The work also discusses practical deployment in ToR-based data centers and resilience to switch failures.

Abstract

Distributed storage systems typically maintain strong consistency between data nodes and metadata nodes by adopting ordered writes: 1) first installing data; 2) then updating metadata to make data visible. We propose SwitchDelta to accelerate ordered writes by moving metadata updates out of the critical path. It buffers in-flight metadata updates in programmable switches to enable data visibility in the network while retaining strong consistency. SwitchDelta uses a best-effort data plane design to overcome the resource limitations of switches and introduces a novel metadata update protocol to exploit the benefits of in-network data visibility. We evaluate SwitchDelta in three distributed in-memory storage systems: log-structured key-value stores, file systems, and secondary indexes. The evaluation shows that SwitchDelta reduces the latency of write operations by up to 52.4% and boosts throughput by up to 126.9% under write-heavy workloads.
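
The contrast between ordered writes and in-network visibility can be illustrated with a toy, single-process model. This is only a sketch under stated assumptions, not the SwitchDelta protocol: the names `DataNode`, `MetadataNode`, `SwitchBuffer`, `ordered_write`, `buffered_write`, and `read` are hypothetical, the "switch" is an ordinary dictionary with unlimited capacity, and failures, concurrency, and the best-effort nature of a real data plane are ignored.

```python
# Toy model of ordered writes vs. buffered metadata updates.
# All names here are hypothetical and chosen only for illustration.

class DataNode:
    def __init__(self):
        self.log = []                      # append-only data log

    def install(self, value):
        self.log.append(value)
        return len(self.log) - 1           # address of the installed data

    def read(self, addr):
        return self.log[addr]


class MetadataNode:
    def __init__(self):
        self.index = {}                    # key -> (timestamp, address)

    def apply(self, key, ts, addr):
        # Keep only the newest update per key, ordered by timestamp.
        if key not in self.index or self.index[key][0] < ts:
            self.index[key] = (ts, addr)

    def lookup(self, key):
        entry = self.index.get(key)
        return entry[1] if entry else None


class SwitchBuffer:
    """Stand-in for the in-switch layer holding in-flight metadata updates."""

    def __init__(self):
        self.pending = {}                  # key -> (timestamp, address)

    def buffer(self, key, ts, addr):
        if key not in self.pending or self.pending[key][0] < ts:
            self.pending[key] = (ts, addr)

    def lookup(self, key):
        entry = self.pending.get(key)
        return entry[1] if entry else None

    def drain(self, meta):
        # Asynchronous step: push buffered updates to the metadata node.
        for key, (ts, addr) in self.pending.items():
            meta.apply(key, ts, addr)
        self.pending.clear()


def ordered_write(data, meta, key, value, ts):
    addr = data.install(value)             # 1) install data
    meta.apply(key, ts, addr)              # 2) update metadata on the critical path


def buffered_write(data, switch, key, value, ts):
    addr = data.install(value)             # 1) install data
    switch.buffer(key, ts, addr)           # 2) data is visible via the buffer;
                                           #    the metadata node is updated later


def read(data, meta, switch, key):
    addr = switch.lookup(key)              # prefer an in-flight update if present
    if addr is None:
        addr = meta.lookup(key)
    return None if addr is None else data.read(addr)
```

In this model a `buffered_write` returns once the data is installed and the update is buffered, and a subsequent `read` already observes the new value even though `MetadataNode` has not been touched; calling `drain` later brings the metadata node up to date without changing what readers see.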
