DiFache: Efficient and Scalable Caching on Disaggregated Memory using Decentralized Coherence

Hanze Zhang; Kaiming Wang; Rong Chen; Xingda Wei; Haibo Chen

TL;DR

The paper tackles the scalability bottleneck of cache coherence in disaggregated memory (DM) systems by introducing DiFache, a decentralized CN-side caching framework. It replaces the centralized coherence manager with per-object invalidation issued directly across compute nodes (CNs) and with adaptive caching driven by each object's real-time caching profit, built on a Hopscotch-based cache index and atomic owner tracking. Evaluations on 54 real-world Twitter traces and two DM applications show substantial throughput and latency improvements, with speedups of up to $10.83\times$ ($5.53\times$ on average) over existing approaches and peak application throughput gains of $7.94\times$ and $2.19\times$. The work offers a practical path toward scalable, coherent CN-side caching in DM and discusses future hardware-coherence opportunities on CXL-based platforms.
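To make the idea of profit-driven adaptive caching concrete, here is a minimal sketch of a per-object caching decision. It is an illustration only, not DiFache's actual policy: the profit formula, the `ObjectStats` type, and the two cost parameters are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class ObjectStats:
    """Hypothetical per-object access counters over a recent window."""
    reads: int = 0
    writes: int = 0


def caching_profit(stats, remote_read_cost=1.0, invalidation_cost=2.0):
    # Assumed model: each cached read saves one remote access, while each
    # write forces an invalidation of the cached copy.
    return stats.reads * remote_read_cost - stats.writes * invalidation_cost


def should_cache(stats, **costs):
    # Cache the object only while the estimated profit is positive.
    return caching_profit(stats, **costs) > 0


read_heavy = ObjectStats(reads=90, writes=10)
write_heavy = ObjectStats(reads=10, writes=90)
print(should_cache(read_heavy), should_cache(write_heavy))  # True False
```

Under this assumed model, a read-heavy object is cached and a write-heavy one is not, which matches the intuition that caching pays off only when saved reads outweigh invalidation work.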

Abstract

The disaggregated memory (DM) architecture offers high resource elasticity at the cost of data access performance. While caching frequently accessed data in compute nodes (CNs) reduces access overhead, it requires costly centralized maintenance of cache coherence across CNs. This paper presents DiFache, an efficient, scalable, and coherent CN-side caching framework for DM applications. Observing that DM applications already serialize conflicting remote data access internally rather than relying on the cache layer, DiFache introduces decentralized coherence that aligns its consistency model with memory nodes instead of CPU caches, thereby eliminating the need for centralized management. DiFache features a decentralized invalidation mechanism to independently invalidate caches on remote CNs and a fine-grained adaptive scheme to cache objects with varying read-write ratios. Evaluations using 54 real-world traces from Twitter show that DiFache outperforms existing approaches by up to $10.83\times$ ($5.53\times$ on average). By integrating DiFache, the peak throughput of two real-world DM applications increases by $7.94\times$ and $2.19\times$, respectively.
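The decentralized invalidation idea can be sketched as a small in-process simulation. This is a simplified illustration under stated assumptions, not the paper's protocol: the class names, the per-object owner bitmap layout, and the use of a lock in place of remote atomic operations are all hypothetical.

```python
import threading


class MemoryNode:
    """Holds object values plus a per-object bitmap of CNs caching each one."""

    def __init__(self):
        self.data = {}
        self.owners = {}
        self._lock = threading.Lock()  # stands in for remote atomic operations

    def read_and_register(self, key, cn_id):
        # Record the reader as an owner, then return the current value.
        with self._lock:
            self.owners[key] = self.owners.get(key, 0) | (1 << cn_id)
            return self.data.get(key)

    def write_and_take_owners(self, key, value):
        # Store the new value and atomically fetch-and-clear the owner bitmap.
        with self._lock:
            self.data[key] = value
            bitmap = self.owners.get(key, 0)
            self.owners[key] = 0
            return bitmap


class ComputeNode:
    def __init__(self, cn_id, mn, peers):
        self.cn_id, self.mn, self.peers = cn_id, mn, peers
        self.cache = {}
        peers[cn_id] = self

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.mn.read_and_register(key, self.cn_id)
        return self.cache[key]

    def write(self, key, value):
        # The writer itself invalidates every owner; no central manager is involved.
        bitmap = self.mn.write_and_take_owners(key, value)
        for cn_id, peer in self.peers.items():
            if bitmap & (1 << cn_id):
                peer.cache.pop(key, None)


mn, peers = MemoryNode(), {}
a, b = ComputeNode(0, mn, peers), ComputeNode(1, mn, peers)
a.write("x", 1)
first = b.read("x")   # miss: fetched from the memory node and cached on b
a.write("x", 2)       # a invalidates b's cached copy directly
second = b.read("x")  # miss again: b observes the new value
print(first, second)  # 1 2
```

The sketch relies on the application serializing conflicting accesses to the same object, as the abstract observes DM applications already do; it does not attempt to order concurrent writers on its own.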
