LLMs as Packagers of HPC Software
Caetano Melone, Daniel Nichols, Konstantinos Parasyris, Todd Gamblin, Harshitha Menon
TL;DR
This work addresses the challenge of automating Spack recipe generation for HPC software, a domain characterized by heterogeneous build systems and complex dependency graphs. It introduces SpackIt, an agentic framework that couples repository analysis, retrieval of relevant examples, and an iterative self-repair loop guided by diagnostic feedback to generate valid Spack recipes. In a large-scale study of 308 CMake-based HPC packages from E4S, SpackIt raises installation success from 19.7% in the zero-shot setting to approximately 83% in the best configuration, demonstrating the value of retrieval-augmented context and structured feedback for reliable package synthesis. The approach advances reproducibility and efficiency in HPC software packaging by grounding model reasoning in repository metadata and domain-specific conventions, and the authors provide a replication package to support further research.
Abstract
High-performance computing (HPC) software ecosystems are inherently heterogeneous, comprising scientific applications that depend on hundreds of external packages, each with distinct build systems, options, and dependency constraints. Tools such as Spack automate dependency resolution and environment management, but their effectiveness relies on manually written build recipes. As these ecosystems grow, maintaining existing recipes and creating new ones becomes increasingly labor-intensive. While large language models (LLMs) have shown promise in code generation, automatically producing correct and maintainable Spack recipes remains a significant challenge. We present a systematic analysis of how LLMs and context-augmentation methods can assist in the generation of Spack recipes. To this end, we introduce SpackIt, an end-to-end framework that combines repository analysis, retrieval of relevant examples, and iterative refinement through diagnostic feedback. We apply SpackIt to a representative subset of 308 open-source HPC packages to assess its effectiveness and limitations. Our results show that SpackIt increases installation success from 20% in a zero-shot setting to over 80% in its best configuration, demonstrating the value of retrieval and structured feedback for reliable package synthesis.
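
The three stages named in the abstract (repository analysis, example retrieval, and refinement from diagnostic feedback) can be pictured as a short generate, install, repair loop. The sketch below is illustrative only and is not SpackIt's implementation: `BuildResult`, `generate_recipe`, `propose`, `try_install`, and `max_attempts` are hypothetical names introduced here, the model call and the install step are passed in as plain callables, and the prompt layout is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class BuildResult:
    """Outcome of one install attempt (hypothetical type, for illustration)."""
    success: bool
    log: str  # diagnostic output fed back to the model on failure


def generate_recipe(
    repo_summary: str,
    examples: List[str],
    propose: Callable[[str], str],
    try_install: Callable[[str], BuildResult],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first recipe that installs, or None if attempts run out."""
    # Initial prompt: repository metadata plus retrieved reference recipes.
    prompt = "\n\n".join([repo_summary, *examples])
    for _ in range(max_attempts):
        recipe = propose(prompt)
        result = try_install(recipe)
        if result.success:
            return recipe
        # Repair step: append the failed recipe and its diagnostics
        # so the next proposal can react to the error.
        prompt = "\n\n".join(
            [prompt, "Previous attempt:", recipe, "Build log:", result.log]
        )
    return None


def fake_propose(prompt: str) -> str:
    # Stand-in for a model call: adds the MPI dependency only after
    # the error message has appeared in the prompt.
    if "missing dependency: mpi" in prompt:
        return 'depends_on("cmake")\ndepends_on("mpi")'
    return 'depends_on("cmake")'


def fake_install(recipe: str) -> BuildResult:
    # Stand-in for a real install attempt.
    if 'depends_on("mpi")' in recipe:
        return BuildResult(True, "")
    return BuildResult(False, "missing dependency: mpi")


if __name__ == "__main__":
    print(generate_recipe("CMake project using MPI", [], fake_propose, fake_install))
```

With the stand-in callables, the first proposal fails, its log is appended to the prompt, and the second proposal succeeds. Setting `max_attempts=1` removes the repair step, so in this sketch it plays the role of the zero-shot baseline.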
