Table of Contents
Fetching ...

Decomposing Docker Container Startup Performance: A Three-Tier Measurement Study on Heterogeneous Infrastructure

Shamsher Khan

TL;DR

A systematic measurement study that decomposes Docker container startup into constituent operations across three heterogeneous infrastructure tiers and quantifies previously under-characterized relationships between infrastructure configuration and container runtime behavior.

Abstract

Container startup latency is a critical performance metric for CI/CD pipelines, serverless computing, and auto-scaling systems, yet practitioners lack empirical guidance on how infrastructure choices affect this latency. We present a systematic measurement study that decomposes Docker container startup into constituent operations across three heterogeneous infrastructure tiers: Azure Premium SSD (cloud SSD), Azure Standard HDD (cloud HDD), and macOS Docker Desktop (developer workstation with hypervisor-based virtualization). Using a reproducible benchmark suite that executes 50 iterations per test across 10 performance dimensions, we quantify previously under-characterized relationships between infrastructure configuration and container runtime behavior. Our key findings include: (1) container startup is dominated by runtime overhead rather than image size, with only 2.5% startup variation across images ranging from 5 MB to 155 MB on SSD; (2) storage tier selection imposes a 2.04x startup penalty (HDD 1157 ms vs. SSD 568 ms); (3) Docker Desktop's hypervisor layer introduces a 2.69x startup penalty and 9.5x higher CPU throttling variance compared to native Linux; (4) OverlayFS write performance collapses by up to two orders of magnitude compared to volume mounts on SSD-backed storage; and (5) Linux namespace creation contributes only 8-10 ms (<1.5%) of total startup time. All measurement scripts, raw data, and analysis tools are publicly available.

Decomposing Docker Container Startup Performance: A Three-Tier Measurement Study on Heterogeneous Infrastructure

TL;DR

A systematic measurement study that decomposes Docker container startup into constituent operations across three heterogeneous infrastructure tiers and quantifies previously under-characterized relationships between infrastructure configuration and container runtime behavior.

Abstract

Container startup latency is a critical performance metric for CI/CD pipelines, serverless computing, and auto-scaling systems, yet practitioners lack empirical guidance on how infrastructure choices affect this latency. We present a systematic measurement study that decomposes Docker container startup into constituent operations across three heterogeneous infrastructure tiers: Azure Premium SSD (cloud SSD), Azure Standard HDD (cloud HDD), and macOS Docker Desktop (developer workstation with hypervisor-based virtualization). Using a reproducible benchmark suite that executes 50 iterations per test across 10 performance dimensions, we quantify previously under-characterized relationships between infrastructure configuration and container runtime behavior. Our key findings include: (1) container startup is dominated by runtime overhead rather than image size, with only 2.5% startup variation across images ranging from 5 MB to 155 MB on SSD; (2) storage tier selection imposes a 2.04x startup penalty (HDD 1157 ms vs. SSD 568 ms); (3) Docker Desktop's hypervisor layer introduces a 2.69x startup penalty and 9.5x higher CPU throttling variance compared to native Linux; (4) OverlayFS write performance collapses by up to two orders of magnitude compared to volume mounts on SSD-backed storage; and (5) Linux namespace creation contributes only 8-10 ms (<1.5%) of total startup time. All measurement scripts, raw data, and analysis tools are publicly available.
Paper Structure (32 sections, 2 equations, 3 figures, 4 tables)

This paper contains 32 sections, 2 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Warm-start latency comparison across platforms and images. Error bars show 95% confidence intervals. The $2.04\times$ HDD penalty and $2.69\times$ Docker Desktop penalty are visible across all image sizes.
  • Figure 2: CPU throttling accuracy under --cpus=0.5 (target: 50%). Box plots show median, IQR, and outliers. macOS exhibits $9.5\times$ higher variance with outliers reaching 247%, demonstrating compounded CFS scheduling inaccuracy through two-level virtualization.
  • Figure 3: Write throughput: OverlayFS vs. volume mounts. On Azure, OverlayFS collapses to $0.006\text{--}0.010\times$ of volume speed due to copy-up overhead. On macOS, the difference is not significant ($p\!=\!0.21$) because the virtio layer already serializes I/O.