HEPTAPOD: Orchestrating High Energy Physics Workflows Towards Autonomous Agency
Tony Menzo, Alexander Roman, Sergei Gleyzer, Konstantin Matchev, George T. Fleming, Stefan Höche, Stephen Mrenna, Prasanth Shyamsundar
TL;DR
This work addresses the challenge of coordinating complex, multi-stage high-energy physics workflows with agentic large language models. It introduces HEPTAPOD, a framework that integrates LLMs with schema-validated tools, a line-delimited event format (evtjsonl), and run-card-driven orchestration to enable transparent, human-in-the-loop planning and execution. The authors demonstrate a representative BSM leptoquark Monte Carlo validation pipeline, spanning model generation, event generation, showering, and analysis, to show reproducibility, provenance, and robust recovery across stages. The results illustrate how an agent-guided, auditable workflow can coordinate heterogeneous software (FeynRules, MadGraph, Pythia, jet clustering, and resonance reconstruction) while preserving human oversight, with clear paths toward additional domains, automated configuration synthesis, and higher degrees of autonomy.
Abstract
Many workflows in high-energy physics (HEP) stand to benefit from recent advances in transformer-based large language models (LLMs). While early applications of LLMs focused on text generation and code completion, modern LLMs now support orchestrated agency: the coordinated execution of complex, multi-step tasks through tool use, structured context, and iterative reasoning. We introduce the HEP Toolkit for Agentic Planning, Orchestration, and Deployment (HEPTAPOD), an orchestration framework designed to bring this emerging paradigm to HEP pipelines. The framework enables LLMs to interface with domain-specific tools, construct and manage simulation workflows, and assist in common utility and data analysis tasks through schema-validated operations and run-card-driven configuration. To demonstrate these capabilities, we consider a representative Beyond the Standard Model (BSM) Monte Carlo validation pipeline that spans model generation, event simulation, and downstream analysis within a unified, reproducible workflow. HEPTAPOD provides a structured and auditable layer between human researchers, LLMs, and computational infrastructure, establishing a foundation for transparent, human-in-the-loop systems.
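As a rough illustration of what a schema-validated operation on line-delimited event records might look like, the sketch below parses one JSON object per line and rejects records that fail a field check. The field names (`event_id`, `particles`, `pdg_id`, `px`, `py`, `pz`, `e`) and the helper functions are hypothetical, chosen only for this example; the actual evtjsonl schema and the HEPTAPOD tool interfaces are not specified in this section.

```python
import json

# Hypothetical schema for illustration only: field name -> accepted type(s).
# The real evtjsonl layout is not described in this section.
NUMBER = (int, float)
EVENT_FIELDS = {"event_id": int, "particles": list}
PARTICLE_FIELDS = {"pdg_id": int, "px": NUMBER, "py": NUMBER, "pz": NUMBER, "e": NUMBER}


def check_fields(record, fields, where):
    """Return a list of problems found in one record; empty means it passed."""
    if not isinstance(record, dict):
        return [f"{where}: expected an object"]
    problems = []
    for name, expected in fields.items():
        if name not in record:
            problems.append(f"{where}: missing field '{name}'")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(record[name], bool) or not isinstance(record[name], expected):
            problems.append(f"{where}: field '{name}' has the wrong type")
    return problems


def validate_event_line(line):
    """Parse one line of text and return (event, problems)."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    problems = check_fields(event, EVENT_FIELDS, "event")
    if not problems:
        for i, particle in enumerate(event["particles"]):
            problems += check_fields(particle, PARTICLE_FIELDS, f"particle {i}")
    return event, problems


def read_events(lines):
    """Split lines into valid events and (line_number, problems) rejects."""
    valid, rejected = [], []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        event, problems = validate_event_line(line)
        if problems:
            rejected.append((number, problems))
        else:
            valid.append(event)
    return valid, rejected
```

Because each record occupies its own line, a malformed event can be reported by line number and skipped without discarding the rest of the file, which is one plausible reason a line-delimited format suits auditable, recoverable pipelines.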
