Investigating the Impact of Isolation on Synchronized Benchmarks
Nils Japke, Furat Hamdan, Diana Baumann, David Bermbach
TL;DR
This work tackles cloud performance variability by using duet benchmarking to compare two versions of a system under test (SUT) on the same VM, so that both are exposed to the same external interference. It introduces a noise generator and evaluates three isolation strategies (cgroups with CPU pinning, Docker containers, and Firecracker MicroVMs) against an unisolated baseline, using bootstrap confidence intervals and Wilcoxon tests to detect latency changes. The findings show that Docker containers are more susceptible to noise and produce more false positives, while cgroups with CPU pinning and Firecracker MicroVMs provide better isolation, with MicroVMs performing best overall. The study gives actionable guidance for selecting isolation techniques in synchronized benchmarks and offers replication artifacts to support broader adoption and validation in cloud performance benchmarking.
Abstract
Benchmarking in cloud environments suffers from performance variability caused by multi-tenant resource contention. Duet benchmarking mitigates this by running two workload versions concurrently on the same VM, exposing them to identical external interference. However, intra-VM contention between the synchronized workloads necessitates additional isolation mechanisms. This work evaluates three such strategies: cgroups with CPU pinning, Docker containers, and Firecracker MicroVMs. We compare each strategy with an unisolated baseline by running benchmarks in a duet setup alongside a noise generator, which "steals" compute resources to degrade performance measurements. Latency distributions differed in all experiments while the noise generator was active, but the results show that process isolation generally lowered the number of false positives, except in our experiments with Docker containers. Even though Docker containers rely internally on cgroups and CPU pinning, they were more susceptible to performance degradation caused by noise. We therefore recommend using process isolation for synchronized workloads, with the exception of Docker containers.
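
To make the notion of a false positive concrete: if both duet slots run the same version, any detected latency change is a false positive. The following is a minimal sketch of such a check, assuming paired latency samples from the two slots and a percentile bootstrap over the median relative difference. The function names, the 95% level, and the resample count are illustrative assumptions, not the authors' analysis pipeline.

```python
import random
import statistics


def bootstrap_ci(diffs, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the median of `diffs`.

    Hypothetical helper: parameters are illustrative defaults.
    """
    rng = random.Random(seed)
    n = len(diffs)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    lo = medians[int((alpha / 2) * n_resamples)]
    hi = medians[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


def detects_change(latencies_a, latencies_b, **kwargs):
    """Report a change if the CI of the relative difference excludes zero."""
    # Pair up measurements taken at the same time in the two duet slots.
    diffs = [(b - a) / a for a, b in zip(latencies_a, latencies_b)]
    lo, hi = bootstrap_ci(diffs, **kwargs)
    return not (lo <= 0.0 <= hi)
```

Under this sketch, an isolation strategy that lets noise affect the two slots unevenly would shift the paired differences away from zero and trigger a detection even though the code under test is unchanged.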
