Flexible Swapping for the Cloud

Milan Pandurov, Lukas Humbel, Dmitry Sepp, Adamos Ttofari, Leon Thomm, Do Le Quoc, Siddharth Chandrasekaran, Sharan Santhanam, Chuan Ye, Shai Bergman, Wei Wang, Sven Lundgren, Konstantinos Sagonas, Alberto Ros

TL;DR

The paper addresses memory overcommit in cloud data centers, where a significant portion of the memory allocated to VMs goes unused. It introduces a flexible userspace memory management framework with per-VM Memory Managers (MMs), a policy API, and an SPDK-backed storage backend that reclaims cold memory using strict huge page swapping. Key contributions include the architecture (MM, policies, storage backend), VM introspection for policy guidance, kernel-assisted EPT scanning, and zero-copy I/O support. The evaluation shows up to 25% better performance than the Linux kernel baseline at similar memory savings; workload-specific policies save a further 10% of memory and improve performance in a limited-memory scenario by 30%. The approach aims to improve memory utilization and reduce cloud operating costs through VM-aware overcommit and flexible reclaim strategies.

Abstract

Memory has become the primary cost driver in cloud data centers. Yet, a significant portion of memory allocated to VMs in public clouds remains unused. To optimize this resource, "cold" memory can be reclaimed from VMs and stored on slower storage or compressed, enabling memory overcommit. Current overcommit systems rely on general-purpose OS swap mechanisms, which are not optimized for virtualized workloads, leading to missed memory-saving opportunities and ineffective use of optimizations like prefetchers. This paper introduces a userspace memory management framework designed for VMs. It enables custom policies that have full control over the virtual machines' memory using a simple userspace API, supports huge page-based swapping to satisfy VM performance requirements, is easy to deploy by leveraging Linux/KVM, and supports zero-copy I/O virtualization with shared VM memory. Our evaluation demonstrates that an overcommit system based on our framework outperforms the state-of-the-art solutions on both micro-benchmarks and commonly used cloud workloads. Specifically, our implementation outperforms the Linux kernel baseline implementation by up to 25% while saving a similar amount of memory. We also demonstrate the benefits of custom policies by implementing workload-specific reclaimers and prefetchers that save 10% additional memory, improve performance in a limited memory scenario by 30% over the Linux baseline, and recover faster from hard limit releases.
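
To make the idea of a custom reclaim policy concrete, the following is a minimal sketch of a cold-page reclaimer driven by periodic access-bit scans. It is not the paper's policy API: the class `ColdPagePolicy`, its `on_scan` method, and the `cold_after` threshold are hypothetical names chosen for illustration, and the 2 MB huge page size is an assumption taken from the page sizes mentioned in the figure list.

```python
from dataclasses import dataclass, field

# Assumed huge page size (2 MB); hypothetical constant, not from the paper's API.
HUGEPAGE_SIZE = 2 * 1024 * 1024


@dataclass
class ColdPagePolicy:
    """Toy reclaimer: a page counts as cold after `cold_after` idle scans."""

    cold_after: int = 3
    # Maps page number -> number of consecutive scans without an access.
    idle_scans: dict = field(default_factory=dict)

    def on_scan(self, accessed: dict) -> list:
        """Consume one access-bit scan and return the pages to swap out.

        `accessed` maps each resident page number to True if its access
        bit was set since the previous scan, and False otherwise.
        """
        victims = []
        for page, was_accessed in accessed.items():
            idle = 0 if was_accessed else self.idle_scans.get(page, 0) + 1
            self.idle_scans[page] = idle
            if idle >= self.cold_after:
                victims.append(page)
        for page in victims:
            # Swapped-out pages are no longer resident, so stop tracking them.
            del self.idle_scans[page]
        return sorted(victims)


policy = ColdPagePolicy(cold_after=2)
first = policy.on_scan({0: True, 1: False, 2: False})
second = policy.on_scan({0: True, 1: False, 2: True})
reclaimed_bytes = len(second) * HUGEPAGE_SIZE
```

In this sketch, page 1 is idle in both scans and is selected for swap-out on the second scan, while page 2 is touched again and stays resident. A real policy would also have to weigh scan frequency against overhead, which is the trade-off the EPT scanning figure refers to.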

Paper Structure (29 sections, 13 figures, 1 table)

Figures (13)

  • Figure 1: Average access latency (ns) with varying percentages of cold-page accesses
  • Figure 2: Page access pattern of a 50%/50% alternating workload measured directly (top) and under virtualization (bottom)
  • Figure 3: Direct (% CPU) and indirect (runtime) costs of increasing the EPT scan frequency with 4 kB pages (top) and 2 MB pages (bottom)
  • Figure 4: System overview
  • Figure 5: Swap-in (in black) and swap-out (in red) using shared memory
  • ...and 8 more figures