Pushing the Limits of In-Network Caching for Key-Value Stores

Gyuyeong Kim

TL;DR

This work tackles load balancing for distributed key-value stores under skewed workloads by enabling in-network caching of variable-length items. It introduces OrbitCache, which circulates cache packets in the switch data plane using packet recirculation and cloning. Because both keys and values travel in the cache packets, hardware item-size limits do not constrain what can be cached. A control plane updates the set of hot keys based on server reports and local popularity counters. The design includes a hash-based cache lookup, a circular-queue request buffer, client-side hash collision resolution, and an invalidation-based coherence protocol. The evaluation on an 8-node testbed shows OrbitCache delivering significantly higher throughput and more balanced server loads than NoCache and NetCache, with manageable latency overhead and good adaptability to dynamic workloads. Overall, OrbitCache demonstrates the feasibility and potential of variable-length in-network caching to boost performance for distributed KV stores, and it informs future directions for programmable-switch architectures.

Abstract

We present OrbitCache, a new in-network caching architecture that can cache variable-length items to balance a wide range of key-value workloads. Unlike existing works, OrbitCache does not cache hot items in the switch memory. Instead, we make hot items revisit the switch data plane continuously by exploiting packet recirculation. Our approach keeps cached key-value pairs in the switch data plane while freeing them from item size limitations caused by hardware constraints. We implement an OrbitCache prototype on an Intel Tofino switch. Our experimental results show that OrbitCache can balance highly skewed workloads and is robust to various system conditions.

Paper Structure (22 sections, 18 figures)

Figures (18)

  • Figure 1: Comparison of the high-level idea with the NetCache architecture. In OrbitCache, clients submit requests, and then circulating cache packets read request metadata. Since both keys and values are in cache packets, hardware constraints do not limit the item size. For variable-length keys, OrbitCache uses key hashes while handling hash collisions at the client.
  • Figure 2: OrbitCache architecture.
  • Figure 3: OrbitCache packet format.
  • Figure 4: Request processing. (a) The switch drops the request after inserting request metadata into the request table; (b) if a circulating cache packet reads request metadata, the switch clones the packet so that the original packet is forwarded to the client and the cloned one is recirculated for further serving; (c) if a write request is for a cached item, the switch invalidates the item to avoid inconsistent reads; (d) upon receiving a write reply for a cached item, the switch validates the item and then clones the packet. The cloned packet is processed as a read reply after its operation type is updated.
  • Figure 5: An example of queueing operations in the request table. Read requests enqueue new metadata and cache packets dequeue the stored metadata.
  • ...and 13 more figures
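
The request flow described in the captions for Figures 1, 4, and 5 can be approximated with a small software model. The sketch below is not the authors' data-plane program. It is a toy Python simulation under stated assumptions: the names `ToySwitch`, `CachePacket`, `key_hash`, `client_accepts`, and `QUEUE_CAPACITY` are hypothetical, the 4-bit CRC32-based hash is a stand-in chosen to make collisions easy to observe, and three behaviors are assumptions rather than facts from the source (forwarding reads to the server when the item is invalid or the queue is full, carrying the new value in the write reply, and detecting collisions by comparing the requested key with the key in the reply).

```python
from collections import deque
from dataclasses import dataclass, field
import zlib

QUEUE_CAPACITY = 4  # hypothetical per-key request-queue depth


def key_hash(key: bytes) -> int:
    """Stand-in for the switch's key hash (4-bit CRC32 slice is an assumption)."""
    return zlib.crc32(key) & 0xF


def client_accepts(requested_key: bytes, reply_key: bytes) -> bool:
    """Assumed client-side collision check: the reply must carry the requested key."""
    return requested_key == reply_key


@dataclass
class CachePacket:
    key: bytes    # full variable-length key travels in the packet
    value: bytes  # so does the value; no switch-memory size limit applies


@dataclass
class ToySwitch:
    orbit: dict = field(default_factory=dict)     # key hash -> circulating CachePacket
    valid: dict = field(default_factory=dict)     # key hash -> validity flag
    requests: dict = field(default_factory=dict)  # key hash -> queue of client ids

    def install(self, key: bytes, value: bytes) -> None:
        """Control-plane action: start circulating a cache packet for a hot key."""
        h = key_hash(key)
        self.orbit[h] = CachePacket(key, value)
        self.valid[h] = True
        self.requests[h] = deque()

    def on_read(self, client: str, key: bytes) -> str:
        """Fig. 4(a): enqueue request metadata and drop the request packet."""
        h = key_hash(key)
        queue = self.requests.get(h)
        if queue is None or not self.valid[h] or len(queue) >= QUEUE_CAPACITY:
            return "to_server"  # assumption: fall back to the storage server
        queue.append(client)
        return "buffered"

    def on_write(self, key: bytes) -> str:
        """Fig. 4(c): invalidate a cached item, then let the server handle the write."""
        h = key_hash(key)
        if h in self.valid:
            self.valid[h] = False
        return "to_server"

    def on_write_reply(self, key: bytes, value: bytes) -> None:
        """Fig. 4(d): validate the item; the cloned reply becomes the new cache packet."""
        h = key_hash(key)
        if h in self.orbit:
            self.orbit[h] = CachePacket(key, value)
            self.valid[h] = True

    def recirculate_once(self) -> list:
        """Fig. 4(b): on each pass, a valid cache packet dequeues one request.

        The original packet goes to the client (modeled as a returned tuple);
        the clone stays in orbit (modeled by leaving the packet in self.orbit).
        """
        replies = []
        for h, packet in self.orbit.items():
            queue = self.requests[h]
            if self.valid[h] and queue:
                replies.append((queue.popleft(), packet.key, packet.value))
        return replies
```

In this model, a read for an installed key returns "buffered" and is answered on the next call to `recirculate_once()`, while a read that arrives between `on_write` and `on_write_reply` returns "to_server". A read for a different key that happens to share the same hash is also buffered and answered with the cached item, which is why the client-side check in `client_accepts` is needed.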