El Agente Gráfico: Structured Execution Graphs for Scientific Agents

Jiaru Bai, Abdulrahman Aldossary, Thomas Swanick, Marcel Müller, Yeonghun Kang, Zijian Zhang, Jin Won Lee, Tsz Wai Ko, Mohammad Ghazi Vakili, Varinia Bernales, Alán Aspuru-Guzik

TL;DR

This work presents El Agente Gráfico, a single-agent framework that embeds LLM-driven decision-making within a type-safe execution environment and uses dynamic knowledge graphs for external persistence. Beyond quantum chemistry tasks, it extends this paradigm to two other large classes of applications: conformer ensemble generation and metal-organic framework design.

Abstract

Large language models (LLMs) are increasingly used to automate scientific workflows, yet their integration with heterogeneous computational tools remains ad hoc and fragile. Current agentic approaches often rely on unstructured text to manage context and coordinate execution, generating large volumes of information that can obscure decision provenance and hinder auditability. In this work, we present El Agente Gráfico, a single-agent framework that embeds LLM-driven decision-making within a type-safe execution environment and dynamic knowledge graphs for external persistence. Central to our approach is a structured abstraction of scientific concepts and an object-graph mapper that represents computational state as typed Python objects, stored either in memory or persisted in an external knowledge graph. This design enables context management through typed symbolic identifiers rather than raw text, thereby ensuring consistency, supporting provenance tracking, and enabling efficient tool orchestration. We evaluate the system by developing an automated benchmarking framework across a suite of university-level quantum chemistry tasks previously evaluated on a multi-agent system, demonstrating that a single agent, when coupled to a reliable execution engine, can robustly perform complex, multi-step, and parallel computations. We further extend this paradigm to two other large classes of applications: conformer ensemble generation and metal-organic framework design, where knowledge graphs serve as both memory and reasoning substrates. Together, these results illustrate how abstraction and type safety can provide a scalable foundation for agentic scientific automation beyond prompt-centric designs.
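To make the idea of context management through typed symbolic identifiers concrete, the following is a minimal sketch and not the authors' implementation. The names `Molecule`, `EnergyResult`, `ObjectStore`, and `run_single_point`, as well as the `urn:example:` identifier scheme and the placeholder energy, are all hypothetical. The sketch only illustrates the pattern the abstract describes: computational state lives in typed Python objects, and tools exchange short identifiers instead of raw text.

```python
from dataclasses import dataclass
from typing import Dict, Type, TypeVar
import uuid

T = TypeVar("T")


@dataclass(frozen=True)
class Molecule:
    # Hypothetical typed record for a molecular geometry.
    name: str
    xyz: str


@dataclass(frozen=True)
class EnergyResult:
    # Hypothetical typed record; it points at its input by identifier,
    # which keeps a provenance link without copying the geometry text.
    molecule_iri: str
    method: str
    energy_hartree: float


class ObjectStore:
    """Toy in-memory stand-in for an object-graph mapper (assumption)."""

    def __init__(self) -> None:
        self._objects: Dict[str, object] = {}

    def put(self, obj: object) -> str:
        # Mint an identifier that encodes the type name plus a random UUID.
        iri = f"urn:example:{type(obj).__name__}/{uuid.uuid4()}"
        self._objects[iri] = obj
        return iri

    def get(self, iri: str, expected: Type[T]) -> T:
        # Resolve an identifier and check the stored object's type.
        obj = self._objects[iri]
        if not isinstance(obj, expected):
            raise TypeError(
                f"{iri} holds {type(obj).__name__}, expected {expected.__name__}"
            )
        return obj


def run_single_point(store: ObjectStore, molecule_iri: str, method: str) -> str:
    # A tool receives an identifier, resolves it to a typed object,
    # and returns the identifier of its typed result.
    mol = store.get(molecule_iri, Molecule)
    # Placeholder value only; a real tool would call a chemistry package.
    energy = -1.0 * len(mol.xyz.splitlines())
    return store.put(EnergyResult(molecule_iri, method, energy))


store = ObjectStore()
water_xyz = "O 0.000 0.000 0.000\nH 0.757 0.586 0.000\nH -0.757 0.586 0.000"
mol_iri = store.put(Molecule("water", water_xyz))
res_iri = run_single_point(store, mol_iri, "HF")
result = store.get(res_iri, EnergyResult)
```

In this sketch the agent's context would only need to carry `mol_iri` and `res_iri`; requesting `res_iri` as a `Molecule` raises a `TypeError`, which is the kind of consistency check that raw-text context cannot provide.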

Paper Structure

This paper contains 68 sections, 17 figures, and 6 tables.

Figures (17)

  • Figure 1: High-level illustration showing the main components of El Agente Gráfico: (i) GraphChat as the user interface, displaying real-time events such as geometry optimization; (ii) the object-graph mapper (a customized version of Ref. Bai2025twa), used to (de)serialize Python objects into a knowledge graph; (iii) execution graphs for the main workflows with router agents; and (iv) additional tools, including web search, sandboxed code execution, and knowledge graph interaction. The GraphChat user interface can be viewed in the Supplementary Video https://www.youtube.com/playlist?list=PLaUD8plXw_ecR7A1EwVAKL3pzIZqjvuVU.
  • Figure 2: Overall structure of El Agente Gráfico: (a) Users can interact with Gráfico through prompts or by uploading xyz files, while Gráfico provides real-time updates of calculations and molecular trajectories; (b) Typed execution graph for GPU4PySCF showing the execution nodes with multiple admissible states that are controlled by the router agent; (c) Generated runtime (typed) states are kept in the knowledge graph for later retrieval and allow real-time user inspection; (d) The router agent makes schema-compliant decisions while instantiating typed arguments for the next node.
  • Figure 3: Generation of Boltzmann-weighted absorption spectra using TDDFT, showing Gráfico generating (solvated) conformers and passing their IRIs to (GPU4)PySCF. Absorption spectra plots were generated with Gráfico. In the top plot, the geometry was provided in the prompt (not shown here). Prompts were summarized for brevity; complete prompts and transcripts are provided in Supplementary Sec. \ref{si:crest_pyscf}.
  • Figure 4: The MOF workflow includes: (1) structure acquisition by CCDC refcode Moghadam2017CCDC from the CoRE-MOF database Zhao2025coremof, (2) semantic decomposition of CIF files into topology, metal nodes, and organic linkers, (3) combinatorial search within the KG to propose new hypothetical MOFs, (4) MOF construction with PORMAKE Lee2021pormake, (5) geometry optimization using a GPU-accelerated MLIP, and (6) porosity analysis using Zeo++ Willems2012zeopp. These tools allow Gráfico to process, propose, and analyze new MOFs, as well as perform the necessary queries against both the in-memory and external graphs. Prompts were summarized for brevity; complete prompts and transcripts are provided in Supplementary Sec. \ref{si:mof_exploration}.
  • Figure 5: Current stage and roadmap of future work showing: (1) structured execution of typed execution graphs and tools (as implemented in the current Gráfico system); (2) asynchronous and resource-aware execution to enable proactive agents; (3) semantic boundary evolution allowing rapid extension of tools and ontologies; (4) long-horizon agents towards a distributed network of AI scientists.
  • ...and 12 more figures
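
The Boltzmann weighting named in the Figure 3 caption can be sketched as follows. This is an illustrative assumption and not the paper's pipeline: the caption does not state which energies (electronic or free), temperature, or broadening were used, so the function names, the default of 298.15 K, and the toy numbers below are all hypothetical. Each conformer i receives a weight w_i = exp(-(E_i - E_min)/(k_B T)) / Σ_j exp(-(E_j - E_min)/(k_B T)), and the ensemble spectrum is the weighted sum of the per-conformer spectra sampled on a shared grid.

```python
import math
from typing import List, Sequence

# Boltzmann constant in Hartree per kelvin.
K_B_HARTREE_PER_K = 3.166811563e-6


def boltzmann_weights(
    energies_hartree: Sequence[float], temperature_k: float = 298.15
) -> List[float]:
    """Return normalized Boltzmann weights for a set of conformer energies."""
    if not energies_hartree:
        raise ValueError("at least one energy is required")
    # Shift by the minimum energy so the exponentials stay well scaled.
    e_min = min(energies_hartree)
    kt = K_B_HARTREE_PER_K * temperature_k
    factors = [math.exp(-(e - e_min) / kt) for e in energies_hartree]
    total = sum(factors)
    return [f / total for f in factors]


def weighted_spectrum(
    spectra: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-conformer spectra sampled on the same wavelength grid."""
    if len(spectra) != len(weights):
        raise ValueError("need exactly one weight per spectrum")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("all spectra must share the same grid")
    return [
        sum(w * s[i] for w, s in zip(weights, spectra)) for i in range(n_points)
    ]


# Toy inputs (hypothetical): three conformers, intensities on a 4-point grid.
energies = [-76.4000, -76.3990, -76.3980]
spectra = [
    [0.0, 1.0, 0.5, 0.0],
    [0.0, 0.8, 0.7, 0.1],
    [0.1, 0.6, 0.9, 0.2],
]
weights = boltzmann_weights(energies)
combined = weighted_spectrum(spectra, weights)
```

The lowest-energy conformer dominates the combined spectrum, and the contribution of higher-energy conformers falls off exponentially with their energy gap relative to k_B T.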