Grounding LLMs in Scientific Discovery via Embodied Actions

Bo Zhang, Jinfeng Zhou, Yuxuan Chen, Jianing Yin, Minlie Huang, Hongning Wang

TL;DR

EmbodiedAct is a framework that transforms established scientific software into active embodied agents by grounding LLMs in embodied actions with a tight perception-execution loop. It achieves SOTA performance, with reliable and stable long-horizon simulations and improved accuracy in scientific modeling.

Abstract

Large Language Models (LLMs) have shown significant potential in scientific discovery but struggle to bridge the gap between theoretical reasoning and verifiable physical simulation. Existing solutions operate in a passive "execute-then-response" loop and thus lack runtime perception, leaving agents blind to transient anomalies (e.g., numerical instability or diverging oscillations). To address this limitation, we propose EmbodiedAct, a framework that transforms established scientific software into active embodied agents by grounding LLMs in embodied actions with a tight perception-execution loop. We instantiate EmbodiedAct within MATLAB and evaluate it on complex engineering design and scientific modeling tasks. Extensive experiments show that EmbodiedAct significantly outperforms existing baselines, achieving SOTA performance through reliable and stable long-horizon simulations and improved accuracy in scientific modeling.
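
To make the contrast between a passive loop and runtime perception concrete, here is a minimal sketch in Python. It is not the paper's implementation and does not use MATLAB: the toy ODE, the function names (`simulate`, `bounded`, `run_with_perception`), the divergence threshold, and the step-halving "hot-fix" are all assumptions chosen for illustration. A passive run only sees the final value, whereas the monitored run checks every step and reacts as soon as the solution leaves a physically plausible range.

```python
import math


def simulate(step, n_steps, monitor=None):
    """Explicit Euler on dx/dt = -50*x with x(0) = 1.

    The true solution decays to zero, but explicit Euler is numerically
    unstable for this problem when step > 2/50 = 0.04.
    Returns (final_x, steps_taken, completed).
    """
    x = 1.0
    for i in range(n_steps):
        x = x + step * (-50.0 * x)
        # Hypothetical runtime perception hook: inspect the state each step.
        if monitor is not None and not monitor(i, x):
            return x, i + 1, False
    return x, n_steps, True


def bounded(i, x):
    """Assumed anomaly check: a decaying solution must stay within |x| <= 1."""
    return math.isfinite(x) and abs(x) <= 1.0


def run_with_perception(step, t_end, max_fixes=5):
    """Abort on the first anomaly, halve the step size, and rerun."""
    for fixes in range(max_fixes + 1):
        n_steps = round(t_end / step)
        x, _, completed = simulate(step, n_steps, monitor=bounded)
        if completed:
            return x, step, fixes
        step /= 2.0  # illustrative hot-fix, not the paper's policy
    raise RuntimeError("simulation still unstable after all hot-fixes")


# Passive run: the divergence is only visible in the final value.
passive_x, _, _ = simulate(step=0.1, n_steps=10)

# Monitored run: the anomaly is caught at the first step and repaired.
final_x, final_step, fixes = run_with_perception(step=0.1, t_end=1.0)
print(f"passive |x| = {abs(passive_x):.3g}")
print(f"monitored |x| = {abs(final_x):.3g}, step = {final_step}, fixes = {fixes}")
```

In this sketch the monitored run stops an unstable simulation after a single step instead of spending the full horizon on it, which is the kind of early intervention a tight perception-execution loop is meant to allow.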

Paper Structure (45 sections, 7 equations, 11 figures, 12 tables)

Figures (11)

  • Figure 1: Comparison of EmbodiedAct with existing paradigms. EmbodiedAct integrates executable simulation primitives and continuous runtime perception, endowing the agent with the capacity for embodied action within physical simulation environments.
  • Figure 2: Overview of EmbodiedAct, which bridges the LLM agent and simulation environment via the Asynchronous State Sync Protocol. EmbodiedAct orchestrates a fast inner loop driven by the Runtime Perception Engine to trigger immediate Hot-Fixes, and a slow outer loop driven by the Reflective Decision Maker to guide Re-planning.
  • Figure 3: Results (%) of the average pass rate (Pass@3) on the EngDesign benchmark. A shorter error bar indicates greater reliability.
  • Figure 4: Analysis of Stability. The consistency of the minimum ($x$-axis) versus maximum ($y$-axis) scores (on the EngDesign, Extended set) across three independent runs. Ideally, results cluster along the diagonal ($y=x$), indicating perfect stability.
  • Figure 5: Case Study: PID Controller Design for Magnetic Levitation. (Left) Traditional static methods (e.g., Ziegler-Nichols tuning on FOPDT approximation). (Right) EmbodiedAct succeeds via a dual-loop cognitive architecture: the Outer Loop (Top Right) performs structural replanning to resolve opacity, while the Inner Loop (Bottom Right) executes physics-informed parameter tuning.
  • ...and 6 more figures