Got Root? A Linux Priv-Esc Benchmark

Andreas Happe, Jürgen Cito

TL;DR

The paper addresses the need for a reproducible Linux privilege-escalation benchmark to evaluate attacker capabilities and tooling in controlled environments. It proposes a locally runnable, air-gapped benchmark implemented with Vagrant and Ansible on Debian. Each VM contains a single vulnerability, and the test cases span classes such as SUID/sudo misconfigurations, Docker-based privilege escalation, information disclosure, and cron-based exploits; kernel exploits and certain other vectors are intentionally omitted for stability and relevance. Each test case is mapped to MITRE ATT&CK techniques to provide a structured attack-path dataset, and optional hints simulate human checklists to reflect realistic attacker workflows. The authors discuss the predominance of enumeration over exploitation, the mix of single- and multi-step exploits, and the role of automation in facilitating efficient assessment. The benchmark is released as open source to enable reproducible evaluations and community-driven expansion, aiming to strengthen defender capabilities and tooling for Linux privilege escalation scenarios.

Abstract

Linux systems are integral to the infrastructure of modern computing environments, necessitating robust security measures to prevent unauthorized access. Privilege escalation attacks represent a significant threat, typically allowing attackers to elevate their privileges from an initial low-privilege account to the all-powerful root account. A benchmark set of vulnerable systems is essential for evaluating the effectiveness of privilege-escalation techniques performed by both humans and automated tooling. Analyzing the behavior of these actors allows defenders to better fortify their entrusted Linux systems and thus protect their infrastructure from potentially devastating attacks. To address this gap, we developed a comprehensive benchmark for Linux privilege escalation. It provides a standardized platform to evaluate and compare the performance of human and synthetic actors, e.g., hacking scripts or automated tooling.

Paper Structure (11 sections, 4 tables)