Agentic Diagrammatica: Towards Autonomous Symbolic Computation in High Energy Physics

Tony Menzo, Alexander Roman, George T. Fleming, Sergei Gleyzer, Konstantin T. Matchev, Stephen Mrenna

Abstract

We present Diagrammatica, a symbolic computation extension to the HEPTAPOD agentic framework, which enables LLM agents to plan and execute multi-step theoretical calculations. Symbolic computation poses a distinctive reliability challenge for LLM agents, as correctness is governed by implicit mathematical conventions that are not encoded in a form that can be easily checked in the computational backend. We identify two complementary remedies, tool-constrained computation and targeted knowledge grounding, and pursue the first as the primary architecture. Concretely, we concentrate the agent's action distribution onto tool calls with convention-fixing semantics, in which the agent specifies a compact, human-auditable diagram specification and a trusted backend performs the symbolic or numerical manipulations exactly. The toolkit provides two complementary calculation paths consuming a shared diagram specification: Naive Dimensional Analysis (NDA) for order-of-magnitude rate estimates and Exact Diagrammatic Analysis (EDA) for tree-level symbolic calculations via automatic FeynCalc code generation, both supplemented by automatic Feynman diagram enumeration and a navigable theory knowledge base. The architecture is validated on two benchmarks: (1) an exhaustive catalog of all tree-level, single-vertex $1\to 2$ partial decay widths across scalar, fermion, and vector parents, with complete massless and threshold limits and Standard Model validation; and (2) an NDA sensitivity study of the muon decay multiplicity $\mu^+ \to \bar{\nu}_\mu \nu_e\, e^+ + n(e^+e^-)$, determining the maximum observable $n$ at current and planned muon experiments.

Paper Structure

This paper contains 63 sections, 54 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Schematic depiction of each action uncertainty (entropy) component as a function of context length.
  • Figure 2: Schematic summary flowchart of a common four-phase solution mode for the task 1 benchmark. A full transcript is given in appendix \ref{app:task1_transcript}.
  • Figure 5: Branching ratio of $\mu^+ \to \bar{\nu}_\mu \nu_e\, e^+ + n(e^+e^-)$ as a function of pair multiplicity $n$. NDA estimates (red circles), MadGraph exact results (blue squares), and the SINDRUM measurement (green diamond) are shown alongside experimental sensitivity thresholds. The $n=4$ point (orange triangle) is extrapolated. Shaded bands indicate NDA uncertainty. Figure generated by the agent with minor cosmetic adjustments.
  • Figure 6: Diagram enumeration metadata. Left: dominant-class (single-$W$) count compared to total, with percentages. Right: fractional composition by heavy-propagator class. Figure generated by the agent with minor cosmetic adjustments.