Agentic Exploration of Physics Models

Maximilian Nägele, Florian Marquardt

TL;DR

SciExplorer demonstrates that a generalist LLM agent, empowered by tool use and external memory, can autonomously drive the heuristic loop of scientific discovery across mechanical, wave, and quantum domains without task-specific tuning. By integrating a minimal system prompt with flexible Python-code execution, plotting, and domain simulators, the agent designs experiments, analyzes results, and forms hypotheses to recover governing equations and Hamiltonians with high fidelity. Across mechanical dynamics, field/wave evolution, and quantum spin models, the approach yields near-perfect fits and robust out-of-sample validations, suggesting strong potential for domain-general scientific exploration and rapid hypothesis testing in experimental contexts. The work points to broad applicability beyond physics, enabling automated discovery, phase-diagram mapping, and control optimization in unknown or partially known systems.

Abstract

The process of scientific discovery relies on an interplay of observations, analysis, and hypothesis generation. Machine learning is increasingly being adopted to address individual aspects of this process. However, it remains an open challenge to fully automate the heuristic, iterative loop required to discover the laws of an unknown system by exploring it through experiments and analysis, without tailoring the approach to the specifics of a given task. Here, we introduce SciExplorer, an agent that leverages large language model tool-use capabilities to enable exploration of systems without any domain-specific blueprints, and apply it to physical systems that are initially unknown to the agent. We test SciExplorer on a broad set of models spanning mechanical dynamical systems, wave evolution, and quantum many-body physics. Despite using a minimal set of tools, primarily based on code execution, we observe impressive performance on tasks such as recovering equations of motion from observed dynamics and inferring Hamiltonians from expectation values. The demonstrated effectiveness of this setup opens the door towards similar scientific exploration in other domains, without the need for finetuning or task-specific instructions.

Paper Structure

This paper contains 20 sections, 29 equations, 29 figures, 3 tables.

Figures (29)

  • Figure 1: Agentic SciExplorer. The explorer is given a general task, which is then applied to a specific physics scenario. During the scientific exploration cycle, LLM-based reasoning is employed to select a (numerical) experiment to be carried out, calling the appropriate tool. The resulting observations are analyzed using one or several calls to analysis tools, often involving both the coding and the multimodal capabilities of the LLM. This cycle repeats until the LLM agent decides it has acquired sufficient information. Finally, a summary of the reasoning chain is produced for the human user. For benchmarking, the agent is asked to provide a testable final answer, e.g. in the form of code that can be run and evaluated.
  • Figure 2: Example exploration. The agent is tasked with discovering the unknown equations of motion of a system by observing and analyzing its dynamics. In this figure, we sketch in a compressed fashion some of the steps in the resulting extended autonomous exploration. The agent first runs several experiments with varying initial conditions. Then, it infers the qualitative model (here, a particle in some attractive potential) from visualizations. It subsequently employs fitting routines to determine the center of attraction by numerically maximizing the radial acceleration and determines the coefficients of a gravity model through sparse linear regression of the acceleration. Finally, it computes diagnostics, such as the energy and angular momentum, to validate the proposed model before saving the final result as a Python function representing the equation of motion. The sequence of steps depends entirely on the choices of the agent, which are adapted to the given problem, reacting to its own observations and conclusions.
  • Figure 3: Mechanical Systems. The agent discovers the equations of motion of black-box physical systems by specifying initial conditions, observing dynamics, and analyzing the results. We consider generic dynamical systems with one to three generalized coordinates, particles moving in 2d, and systems where an observable particle interacts with an unknown number of hidden degrees of freedom (dof). We run 5 independent attempts per system and show the mean coefficient of determination $R^2$ of the agent's proposed model with the true system with 95% bootstrap confidence intervals. The agent can recover the true model in a large subset of systems. Stars indicate the conversation the agent considers best when asked to rank all attempts.
  • Figure 4: Waves and fields. (a) In its exploration of waves and fields, the agent can select the initial field configuration for each experimental run. It can also define and simulate field equations for comparison. (b) Evolution of the absolute square $|\phi|^2$ of a Gaussian wave packet for the true model and the agent's discovered model (announced by the agent at the end of the exploration). The true model on the left is a linear Schrödinger equation with confining potential (we do not show the most accurate model discovered by the agent in its multiple runs). On the right, the true model is a complex Ginzburg-Landau equation with next-nearest-neighbor hopping on a tight-binding lattice. The $R^2$ value is calculated between the evolution equations for the true and the predicted model, for multiple reasonable initial conditions (see Methods for details). (c) Statistics for various scenarios. We run six independent explorations, and the agent can recover the true model ($R^2\approx 1$) in all but one scenario in at least one attempt.
  • Figure 5: Quantum Many-Body Physics. (a) Discovering the Hamiltonian of a system of multiple spin-1/2 particles, based on observing the dynamics for initial conditions selected by the agent. Each experimental run produces the evolution of the single-spin expectation values $\langle {\hat{M}}(t) \rangle$ (with ${\hat{M}}={\hat{\sigma}}^x_j$, ${\hat{\sigma}}^y_j$, and ${\hat{\sigma}}^z_j$), for an initial (product) state selected by the agent. In some experiments, we ask the agent to discover a whole class of Hamiltonians by allowing it to control experimental parameters, whose meaning it is not aware of ('A' and 'B'). The bottom left shows an example: the complex dynamics of two spins of a Heisenberg model. Right: Performance for various scenarios. Normalized scalar product between the true Hamiltonian and the Hamiltonian proposed by the agent (1 means perfect match, for details see Methods). We consider the following systems: ARB: Arbitrarily chosen Hamiltonian acting on 3 spins, HEIS: 1d Heisenberg model with 10 spins (in two of the attempts, the agent actually discovered the correct model but made a formal error when announcing the final Hamiltonian; we counted them as overlap $-1$), TFI: 1d transverse field Ising model with 10 spins. The letters in brackets denote whether the agent can vary the value of a field parameter (f), of a coupling parameter (c), or observe only two spins of a larger chain (h). (b) Hamiltonian discovery by measuring the ground state expectation values of spin operators $\hat{M}$ defined by the agent. Some examples are shown on the bottom left. Right: Fidelity scaled by the number of spins $N$ between the ground state of the Hamiltonian predicted by the agent after the exploration and the true ground state. We consider the following systems: TFI: 1d transverse field Ising model with 10 spins, HEIS: 2d Heisenberg model with 9 spins, TI: 1d topological Ising model with three-body interactions and 10 spins, ARB: Arbitrarily chosen Hamiltonian acting on 3 spins. Letters denote the same as in part (a). Additionally, the agent can sometimes vary the number of spins in the chain, as denoted by (n).
  • ...and 24 more figures
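
The captions above report three kinds of scores: a coefficient of determination $R^2$, a mean over independent attempts with a 95% bootstrap confidence interval, and a normalized scalar product between Hamiltonians. The exact definitions are given in the paper's Methods and are not reproduced on this page, so the following is only a minimal sketch of how such quantities are commonly computed. The function names, the percentile bootstrap, and the Hilbert-Schmidt inner product are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot


def bootstrap_mean_ci(scores, n_resamples=10_000, level=0.95, seed=0):
    """Mean of `scores` with a percentile-bootstrap confidence interval.

    Assumption: a plain percentile bootstrap over attempts; the paper may
    use a different bootstrap variant.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    # Resample attempts with replacement and record the mean of each resample.
    idx = rng.integers(0, len(scores), size=(n_resamples, len(scores)))
    means = scores[idx].mean(axis=1)
    alpha = (1.0 - level) / 2.0
    lo, hi = np.quantile(means, [alpha, 1.0 - alpha])
    return scores.mean(), lo, hi


def hamiltonian_overlap(h_true, h_pred):
    """Normalized Hilbert-Schmidt overlap Re Tr(A^dagger B) / (|A| |B|).

    Assumption: this is one natural reading of "normalized scalar product";
    it is 1 for matrices that agree up to a positive scale factor.
    """
    h_true = np.asarray(h_true, dtype=complex)
    h_pred = np.asarray(h_pred, dtype=complex)
    num = np.real(np.vdot(h_true, h_pred))  # vdot conjugates its first argument
    return num / (np.linalg.norm(h_true) * np.linalg.norm(h_pred))
```

For example, `bootstrap_mean_ci` could be applied to the five per-attempt $R^2$ values of one mechanical system, and `hamiltonian_overlap` to dense matrix representations of the true and proposed spin Hamiltonians.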