Optimizing CPU Cache Utilization in Cloud VMs with Accurate Cache Abstraction

Mani Tofigh, Edward Guo, Weiwei Jia, Xiaoning Ding, Zirui Neil Zhao, Jianchen Shan

TL;DR

The paper addresses the lack of visibility into vCache provisioning in cloud VMs and its negative impact on cache-based optimizations. It introduces CacheX, an in-VM probing framework using eviction sets to reveal fine-grained vCache properties and per-color contention without hardware or hypervisor support, enabling two techniques: LLC contention-aware task scheduling (cas) and virtual color-aware page cache management (cap). The approach demonstrates accurate identification of eviction sets and colors, low monitoring overhead, and tangible improvements in throughput and latency across cloud workloads, with validation in local and public-cloud VMs. The results suggest that in-VM cache-awareness can significantly mitigate inter- and intra-VM interference while remaining low-cost and deployable in multi-cloud environments.

Abstract

This paper shows that cache-based optimizations are often ineffective in cloud virtual machines (VMs) due to limited visibility into and control over provisioned caches. In public clouds, CPU caches can be partitioned or shared among VMs, but a VM is unaware of cache provisioning details. Moreover, a VM cannot influence cache usage via page placement policies, as memory-to-cache mappings are hidden. The paper proposes a novel solution, CacheX, which probes accurate and fine-grained cache abstraction within VMs using eviction sets without requiring hardware or hypervisor support, and showcases the utility of the probed information with two new techniques: LLC contention-aware task scheduling and virtual color-aware page cache management. Our evaluation of CacheX's implementation in the x86 Linux kernel demonstrates that it can effectively improve cache utilization for various workloads in public cloud VMs.
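To make the "hidden memory-to-cache mapping" point concrete, here is a minimal sketch of classic page coloring. It is not CacheX's method. It assumes a physically indexed, set-associative cache with no slice hashing, and the cache geometry, the addresses, and the helper names `num_page_colors` and `page_color` are all hypothetical. It shows why a color computed from a guest-physical address (GPA) need not match the color of the host-physical address (HPA) that actually backs the page.

```python
PAGE_BYTES = 4096  # assumed 4 KiB pages


def num_page_colors(cache_bytes: int, ways: int, page_bytes: int = PAGE_BYTES) -> int:
    """Number of page colors for a physically indexed cache (no slice hashing assumed)."""
    way_bytes = cache_bytes // ways          # bytes indexed by one way (sets * line size)
    return max(1, way_bytes // page_bytes)   # pages needed to cover all sets once


def page_color(addr: int, n_colors: int, page_bytes: int = PAGE_BYTES) -> int:
    """Color of the page containing addr: low bits of the page frame number."""
    return (addr // page_bytes) % n_colors


# Hypothetical geometry: 1 MiB, 16-way cache -> 64 KiB per way -> 16 colors.
n_colors = num_page_colors(1 << 20, 16)

# Hypothetical translation: the guest sees gpa, the hypervisor backs it with hpa.
gpa = 0x12345000
hpa = 0x7A9B2000

guest_view = page_color(gpa, n_colors)  # what the VM would compute
host_view = page_color(hpa, n_colors)   # what the hardware actually uses
print(n_colors, guest_view, host_view)
```

Because the guest cannot see the GPA-to-HPA translation, a color derived from the GPA alone can place pages in the wrong cache sets. That is the gap the abstract says CacheX closes by probing with eviction sets instead.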

Paper Structure

This paper contains 20 sections, 12 figures, 6 tables.

Figures (12)

  • Figure 1: Mapping addresses to Skylake-SP’s L2 and LLC.
  • Figure 2: The impact of counterproductive cache affinity. In (a), the metrics are normalized to EEVDF, and the first two bars represent the metrics under EEVDF, while the last two bars depict those when threads are pinned. In (b), LLC 0 is heavily contended.
  • Figure 3: (a) Slowdowns caused by cache polluter relative to solo runs. (b) GPA-derived color's distribution in HPA-derived colors.
  • Figure 4: Avoidable inter-VM cache conflicts under asymmetric contention across LLC sets.
  • Figure 5:
  • ...and 7 more figures