HerAgent: Rethinking the Automated Environment Deployment via Hierarchical Test Pyramid
Xiang Li, Siyu Lu, Federica Sarro, Claire Le Goues, He Ye
TL;DR
HerAgent redefines automated environment deployment by introducing the Environment Maturity Hierarchy, which distinguishes Installability, Testability, and Runnability and treats execution evidence as the true success signal. The approach builds executable environments through a three-stage pipeline (Initial Bash File Generation, Test Pyramid Construction, and Interactive Environment Deployment) and uses a dual-loop repair mechanism to incrementally validate and repair the Bash File, guided by a formal state transition policy over maturity levels. Across four benchmarks and 14 languages, HerAgent achieves state-of-the-art effectiveness, uniquely resolving 11-30 instances across the benchmarks that no prior method can configure and showing strong gains on challenging C/C++ projects; ablation studies highlight the necessity of interactive feedback and hybrid repair for reaching full runnability. The work provides a principled, executable, and generalizable framework for automated environment configuration with practical impact on reproducibility, agent-based software engineering, and cross-language project deployment.
Abstract
Automated software environment setup is a prerequisite for testing, debugging, and reproducing failures, yet remains challenging in practice due to complex dependencies, heterogeneous build systems, and incomplete documentation. Recent work leverages large language models to automate this process, but typically evaluates success using weak signals such as dependency installation or partial test execution, which do not ensure that a project can actually run. In this paper, we argue that environment setup success should be evaluated through executable evidence rather than a single binary signal. We introduce the Environment Maturity Hierarchy, which defines three success levels based on progressively stronger execution requirements, culminating in successful execution of a project's main entry point. Guided by this hierarchy, we propose HerAgent, an automated environment setup approach that incrementally constructs executable environments through execution-based validation and repair. We evaluate HerAgent on four public benchmarks, where it outperforms all related work, achieving up to 79.6% improvement due to its holistic understanding of project structure and dependencies. On complex C/C++ projects, HerAgent surpasses prior approaches by 66.7%. In addition, HerAgent uniquely resolves 11-30 environment instances across the benchmarks that no prior method can configure.
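
To make the idea of graded, execution-based success concrete, the following is a minimal illustrative sketch, not the authors' implementation. It models the three maturity levels as an ordered enumeration and runs a single validate-then-repair loop that targets the lowest failing level. The names `Maturity`, `assess`, `deploy`, and the `checks` and `repair` callbacks are hypothetical, and the toy checks only look for marker words in a script string rather than executing anything; the dual-loop repair mechanism and the formal state transition policy of HerAgent are not reproduced here.

```python
from enum import IntEnum
from typing import Callable, Dict, Tuple


class Maturity(IntEnum):
    """Hypothetical encoding of the three success levels, ordered weakest to strongest."""
    NONE = 0
    INSTALLABLE = 1   # dependencies install
    TESTABLE = 2      # tests can be executed
    RUNNABLE = 3      # the project's main entry point executes


Check = Callable[[str], bool]            # script -> did this level's check pass?
Repair = Callable[[str, Maturity], str]  # (script, failing level) -> revised script

LEVELS = (Maturity.INSTALLABLE, Maturity.TESTABLE, Maturity.RUNNABLE)


def assess(script: str, checks: Dict[Maturity, Check]) -> Maturity:
    """Return the highest level reached; a level counts only if all lower levels pass."""
    reached = Maturity.NONE
    for level in LEVELS:
        if not checks[level](script):
            break
        reached = level
    return reached


def deploy(script: str, checks: Dict[Maturity, Check], repair: Repair,
           max_rounds: int = 5) -> Tuple[str, Maturity]:
    """Repeatedly repair the lowest failing level until RUNNABLE or the budget runs out."""
    level = assess(script, checks)
    for _ in range(max_rounds):
        if level == Maturity.RUNNABLE:
            break
        failing = Maturity(level + 1)
        script = repair(script, failing)
        level = assess(script, checks)
    return script, level


# Toy stand-ins (assumptions): a real system would execute the script in a sandbox.
toy_checks: Dict[Maturity, Check] = {
    Maturity.INSTALLABLE: lambda s: "INSTALL" in s.split(),
    Maturity.TESTABLE: lambda s: "TEST" in s.split(),
    Maturity.RUNNABLE: lambda s: "RUN" in s.split(),
}
toy_marker = {Maturity.INSTALLABLE: "INSTALL", Maturity.TESTABLE: "TEST",
              Maturity.RUNNABLE: "RUN"}


def toy_repair(script: str, failing: Maturity) -> str:
    """Append the marker that the failing level's toy check looks for."""
    return script + "\n" + toy_marker[failing]


final_script, final_level = deploy("INSTALL", toy_checks, toy_repair)
print(final_level.name)
```

The ordering matters in this sketch: a script that satisfies only the strongest check is still rated `NONE`, since each level presupposes the ones beneath it, which mirrors the abstract's "progressively stronger execution requirements".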
