Squeezy: Rapid VM Memory Reclamation for Serverless Functions
Orestis Lagkas Nikolos, Chloe Alverti, Stratos Psomadakis, Georgios Goumas, Nectarios Koziris
TL;DR
The paper tackles memory elasticity for N:1 FaaS VMs by exposing and exploiting the predictability of function memory footprints. It introduces Squeezy, an OS-level extension that partitions guest memory into fixed-size per-function chunks plus a shared partition, enabling fast, migration-free memory reclamation when a function terminates. Implemented in Linux v6.6 and integrated with an OpenWhisk-based runtime, Squeezy reclaims multiple GiB in under a second, keeps CPU overhead low, and keeps tail latency bounded under realistic serverless workloads. Compared with baseline approaches and 1:1 microVMs, Squeezy reduces memory waste and improves end-to-end performance, making memory elasticity viable at sub-second timescales for serverless deployments.
Abstract
Resource elasticity is one of the key defining characteristics of the Function-as-a-Service (FaaS) serverless computing paradigm. While compute resources assigned to VM-sandboxed functions can be seamlessly adjusted on the fly, memory elasticity remains challenging. Memory hot(un)plug suffers from long reclamation latencies and consumes valuable CPU resources. We identify the obliviousness of the OS memory manager to the hotplugged memory as the key issue hindering hot-unplug performance, and design Squeezy, a novel approach for fast and efficient VM memory hot(un)plug, targeting VM-sandboxed serverless functions. Our key insight is that by segregating hotplugged memory regions from regular VM memory, we can bound the lifetime of allocations within these regions, enabling their fast and efficient reclamation. We implement Squeezy in Linux v6.6 as an extension to the OS memory manager. Our evaluation shows that, when reclaiming VM memory, Squeezy is an order of magnitude faster than the state of the art while keeping tail latency bounded, achieving sub-second reclamation of multiple GiB of memory while serving realistic FaaS load.
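
The key insight can be illustrated with a toy model. The sketch below is not the Squeezy kernel implementation; it is a minimal Python simulation, and every name in it (`Partition`, `ToyGuest`, `start_function`, `terminate_function`) is hypothetical. It assumes each function instance gets one fixed-size hotplugged partition and that all of the function's private allocations stay inside it, so the partition can be unplugged as a whole when the function terminates, without migrating any pages.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """A fixed-size chunk of guest memory (hypothetical toy abstraction)."""
    size_pages: int
    used_pages: int = 0

    def allocate(self, pages: int) -> None:
        # Allocations are confined to the partition; overflowing is an error.
        if self.used_pages + pages > self.size_pages:
            raise MemoryError("allocation exceeds the partition size")
        self.used_pages += pages


@dataclass
class ToyGuest:
    """Toy guest: one shared partition plus one partition per function."""
    partition_pages: int
    shared: Partition
    per_function: dict = field(default_factory=dict)

    def start_function(self, name: str) -> None:
        # Model a hotplug: a dedicated partition appears for the new function.
        self.per_function[name] = Partition(self.partition_pages)

    def allocate(self, name: str, pages: int) -> None:
        # Private allocations never leave the function's own partition.
        self.per_function[name].allocate(pages)

    def terminate_function(self, name: str) -> tuple:
        # Every allocation in the partition dies with the function, so the
        # whole partition is unplugged and no page has to be migrated.
        part = self.per_function.pop(name)
        return part.size_pages, 0  # (pages reclaimed, pages migrated)

    def plugged_pages(self) -> int:
        # Total memory currently plugged into the toy guest.
        private = sum(p.size_pages for p in self.per_function.values())
        return self.shared.size_pages + private


guest = ToyGuest(partition_pages=1024, shared=Partition(size_pages=2048))
guest.start_function("a")
guest.start_function("b")
guest.allocate("a", 100)
reclaimed, migrated = guest.terminate_function("a")
```

In this model, terminating function `a` returns its full partition with zero migrations, regardless of how many pages the function actually touched. A real memory manager that interleaves hotplugged and regular memory would first have to move live pages out of the region being unplugged, which is the cost the segregation is meant to avoid.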
