Zero-consistency root emulation for unprivileged container image build

Reid Priedhorsky, Michael Jennings, Megan Phinney

TL;DR

The paper addresses the challenge of building HPC container images in entirely unprivileged environments, even though traditional package managers expect privileged operations. It introduces a lightweight zero-consistency root emulation: a seccomp filter makes privileged system calls appear to succeed without actually executing them, avoiding complex stateful emulation. This method enables the majority of Dockerfile builds in Type III (fully unprivileged) containers and offers lower overhead, greater simplicity, and better portability than full emulation approaches, though some edge cases (such as certain systemd scripts or unminimize) remain problematic. The work demonstrates a practical path toward fully unprivileged image builds in HPC workflows and outlines directions for expanding syscall coverage and adding selective consistency, with attention to performance and compatibility implications.

Abstract

Do Linux distribution package managers need the privileged operations they request to actually happen? Apparently not, at least for building container images for HPC applications. We use this observation to implement a root emulation mode using a Linux seccomp filter that intercepts some privileged system calls, does nothing, and returns success to the calling program. This approach provides no consistency whatsoever but appears sufficient to build all Dockerfiles we examined, simplifying fully-unprivileged workflows needed for HPC application containers.
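
To make the idea concrete, the following is a minimal toy model in Python, not the authors' implementation (which is a Linux seccomp filter). The class `ToyKernel`, the function `filtered_syscall`, and the contents of `FAKED_SYSCALLS` are all hypothetical names and values chosen for illustration; the set of system calls the paper actually intercepts may differ. The sketch only shows what "zero consistency" means: a faked call reports success, but no state changes, so a later query still sees the old value.

```python
import errno

# Illustrative assumption: a few privileged calls a filter might fake.
# This is NOT the paper's actual list.
FAKED_SYSCALLS = frozenset({"chown", "setuid", "setgid", "mknod"})


class ToyKernel:
    """Tiny stand-in for kernel file-ownership state seen by an unprivileged user."""

    def __init__(self, uid):
        self.uid = uid
        self.owners = {}  # path -> (uid, gid)

    def create(self, path):
        # New files belong to the calling (unprivileged) user.
        self.owners[path] = (self.uid, self.uid)

    def chown(self, path, uid, gid):
        # Without privilege, changing ownership to anyone else is refused.
        if (uid, gid) != self.owners[path]:
            return -errno.EPERM
        return 0

    def stat_owner(self, path):
        return self.owners[path]


def filtered_syscall(kernel, name, *args, emulate_root=True):
    """Dispatch a call, faking success for intercepted privileged calls."""
    if emulate_root and name in FAKED_SYSCALLS:
        return 0  # do nothing, report success; kernel state is untouched
    return getattr(kernel, name)(*args)


if __name__ == "__main__":
    k = ToyKernel(uid=1000)
    k.create("/usr/bin/foo")
    print(filtered_syscall(k, "chown", "/usr/bin/foo", 0, 0, emulate_root=False))
    print(filtered_syscall(k, "chown", "/usr/bin/foo", 0, 0))
    print(k.stat_owner("/usr/bin/foo"))
```

In this model, the unfiltered `chown` fails with `-EPERM`, the filtered one returns 0, and `stat_owner` still reports the original owner afterward. That mismatch between the reported success and the observable state is the inconsistency the abstract says package managers appear to tolerate.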

Paper Structure (9 sections, 2 figures)

Figures (2)

  • Figure 1: Example Dockerfiles built with a Type III (fully unprivileged) implementation and no root emulation. One example (fig:naive-win) succeeded because no privileged system calls were used, while the other (fig:naive-fail) failed because rpm(8) tried to change a file’s owner, a privileged operation.
  • Figure 2: Successful seccomp root-emulation build of the failing Dockerfile (fig:naive-fail) from Figure 1.