TRACE: A Multi-Agent System for Autonomous Physical Reasoning for Seismology

Feng Liu, Jian Xu, Xin Cui, Xinghao Wang, Zijie Guo, Jiong Wang, S. Mostafa Mousavi, Xinyu Gu, Hao Chen, Ben Fei, Lihua Fang, Fenghua Ling, Zefeng Li, Lei Bai

Abstract

Inferring physical mechanisms that govern earthquake sequences from geophysical observations remains a challenging task, particularly across tectonically distinct environments where similar seismic patterns can reflect different underlying processes. Current seismological processing and interpretation rely heavily on experts' choice of parameters and the synthesis of various seismological products, limiting reproducibility and the formation of generalizable knowledge across settings. Here we present TRACE (Trans-perspective Reasoning and Automated Comprehensive Evaluator), a multi-agent system that combines large language model planning with formal seismological constraints to derive auditable, physically grounded mechanistic inferences from raw observations. Applied to the 2019 Ridgecrest sequence, TRACE autonomously identifies stress-perturbation-induced delayed triggering, resolving the cascading interaction between the Mw 6.4 and Mw 7.1 mainshocks. For the 2025 Santorini-Kolumbo volcanic crisis, the system identifies a structurally guided intrusion model, distinguishing episodic migration via fault channels from the continuous propagation expected in homogeneous crustal failure. By providing a generalizable infrastructure for deriving physical insights from seismic phenomena, TRACE advances the field from expert-dependent analysis toward knowledge-guided autonomous discovery in Earth sciences.

Paper Structure (52 sections, 6 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: The TRACE multi-agent framework for autonomous seismic discovery and end-to-end scientific reasoning. a, Schematic architecture of the TRACE autonomous reasoning system. The workflow translates an open-ended scientific request, augmented by a domain-specific knowledge library, into physically constrained analytical strategies. A specialized Planning Agent decomposes the request into structured protocols, which are overseen by human supervision to ensure scientific alignment. The Workflow Agent then instantiates these plans into executable task sequences. Subsequently, a Coding Agent generates modular scripts by linking a suite of seismological algorithms and statistical inference libraries. A Result Checking Agent performs real-time diagnostics to verify physical consistency, while the Analysis & Summary Agent synthesizes multidimensional evidence into mechanistic scientific reports. b, Tool-driven long-chain seismology processing. This module demonstrates the automated execution of a complete seismic pipeline, comprising waveform pre-processing, event detection, phase picking, association, high-precision hypocenter relocation, and the construction of a final earthquake catalog. c, Multi-scale statistical analysis and cross-regional application. The framework exhibits scalability across diverse tectonic settings, illustrated here by high-resolution seismicity catalogs for the Ridgecrest shear zone, the Santorini-Kolumbo volcanic system, and global-scale seismic monitoring. d, Mechanistic reasoning across multiple perspectives. TRACE integrates statistical modeling with spatiotemporal evolution analysis to perform causal inference. By evaluating magnitude-frequency distributions and transformed magnitude cumulative distributions, the system bridges raw observational data with physically constrained interpretations of tectonic processes.
  • Figure 2: High-resolution earthquake catalog construction using the multi-agent system TRACE. TRACE generates a high-resolution earthquake catalog for the 2019 Ridgecrest seismic sequence within the study area ($35.25^{\circ}$–$36.25^{\circ}$ N, $118.0^{\circ}$–$117.0^{\circ}$ W). The workflow integrates six sequential processing stages, comprising (1) continuous waveform acquisition from regional networks such as CI, GS, and PB; (2) waveform preprocessing; (3) deep-learning-based earthquake detection combined with $P$ and $S$ phase picking; (4) phase association; (5) initial hypocenter location and magnitude estimation; and (6) high-precision relocation to resolve fine-scale seismogenic structures. The resulting catalog provides the basis for analyzing the spatiotemporal evolution and stress interactions between the $M_{\mathrm{w}}$ 6.4 and $M_{\mathrm{w}}$ 7.1 mainshocks.
  • Figure 3: Spatiotemporal organization of seismicity between the Ridgecrest mainshocks. High-resolution analysis of the TRACE-derived earthquake catalog reveals the progressive and structurally controlled evolution of seismicity bridging the $M_{\mathrm{w}}$ 6.4 and $M_{\mathrm{w}}$ 7.1 events. a, Spatiotemporal evolution of seismicity between the $M_{\mathrm{w}}$ 6.4 (blue star) and $M_{\mathrm{w}}$ 7.1 (red star) mainshocks, characterized by kernel density estimation (KDE) maps. b, Temporal evolution of seismicity orientations, revealing directional organization and shifts in dominant alignment consistent with orthogonal fault structures. c, Spatial delineation of Regions A and B and their respective seismicity rate evolution, highlighting contrasting temporal behaviors across fault segments. d, Spatial distribution of activation onset times relative to the $M_{\mathrm{w}}$ 6.4 event, indicating rapid activation along the SW–NE rupture zone followed by delayed expansion along the NW–SE fault system linking the two mainshocks. e, Spatial distribution of $b$-values, where localized low-$b$ anomalies emerge along the future rupture zone, suggesting relatively elevated stress levels. f, Omori–Utsu decay parameters ($p$-values) for Regions A and B, indicating rapid aftershock decay along the SW–NE rupture zone and delayed seismic acceleration along the NW–SE segment. Together, these observations indicate that the $M_{\mathrm{w}}$ 6.4 earthquake did not instantaneously trigger the $M_{\mathrm{w}}$ 7.1 rupture, but progressively organized a structurally controlled seismic corridor preceding the second mainshock.
  • Figure 4: Structural control and mechanistic decoupling of seismic migration during the 2025 Santorini–Kolumbo volcanic crisis. a, High-resolution earthquake catalog (colored circles, color-coded by depth) automatically constructed using TRACE from 15 seismic stations (yellow triangles). Grey lines indicate mapped regional faults, highlighting the alignment of seismicity with the NE–SW trending fault system. b, Geometry and centroid evolution of the earthquake cloud. The red line represents the principal axis (PCA) orientation. Colored circles show the centroid trajectory calculated in 6-hour moving windows, indicating a structure-guided migration path along the tectonic corridor. c, Spatiotemporal evolution of seismicity. Top: projection of earthquake hypocenters along the PCA axis as a function of time. Bottom: magnitude versus relative time. Both panels are color-coded by depth, showing episodic along-strike migration and systematic depth adjustments. d, Kinematic analysis of the migration. Top: centroid position along the PCA axis over time. Bottom: calculated migration rate (km h$^{-1}$). Migration occurs in episodic pulses rather than as a continuous steady advance. e, Temporal evolution of seismic statistics. Top: $b$-value calculated in 6-hour windows (red line indicates moving average). Bottom: frequency of events with $M \ge M_c$ (completeness magnitude). f, Spatial distribution of seismic characteristics. Top: $b$-value distribution (smoothed 6-h moving average) as a function of distance along the PCA axis. Bottom: seismic event density within a 5-km radius of the moving centroid. The spatial heterogeneity of $b$-values and the weak correspondence between the migration front and the largest events indicate a decoupling between structure-guided migration and localized mechanical failure.
  • Figure 5: Performance and distribution of the TRACE benchmark across task hierarchies. a, Distribution of tasks across two distinct complexity levels: atomic tasks (Level 1, green bars, $L1A$–$L1E$) and multi-step analytical tasks (Level 2, blue bars, $L2A$–$L2D$). Sub-levels correspond to representative task categories (Level 1: data retrieval, data formatting, signal processing, feature analysis, and scientific visualization; Level 2: sequential workflows, heuristic branching, batch processing, and parameter-space exploration). The vertical axis represents the absolute number of tasks categorized within each sub-level, with Level 1C ($n = 22$) constituting the largest task group. b, Evaluation of debugging efficiency across task levels for different large language models (LLMs). The vertical axis displays the average number of debug rounds required to achieve task completion. c, Average performance scores for human experts, GPT-5, Claude-4, and Gemini-3 across all task categories. Scores are normalized on a scale of 1 to 5, reflecting overall task performance across different levels of analytical complexity.
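
The captions for Figures 3 and 4 refer to $b$-values computed for events with $M \ge M_c$ and to projections of hypocenters onto a principal (PCA) axis. The summary does not say how TRACE computes either quantity, so the following is only a minimal sketch under stated assumptions: it uses the standard Aki-Utsu maximum-likelihood estimator, $b = \log_{10}(e) / (\bar{M} - (M_c - \Delta M / 2))$, and a plain eigen-decomposition of the epicenter covariance. The function names `b_value_mle` and `project_on_principal_axis`, the `bin_width` default, and the use of local Cartesian coordinates in km are all hypothetical choices made for illustration.

```python
import numpy as np


def b_value_mle(magnitudes, mc, bin_width=0.1):
    """Aki-Utsu maximum-likelihood b-value for events with M >= mc.

    Assumption: this is the textbook estimator, not necessarily the one
    used inside TRACE. bin_width is the magnitude rounding interval
    (use 0.0 for unrounded, continuous magnitudes).
    """
    mags = np.asarray(magnitudes, dtype=float)
    mags = mags[mags >= mc - 1e-9]  # keep only the complete part of the catalog
    if mags.size < 2:
        raise ValueError("need at least two events at or above mc")
    mean_excess = mags.mean() - (mc - bin_width / 2.0)
    b = np.log10(np.e) / mean_excess
    b_err = b / np.sqrt(mags.size)  # first-order uncertainty (Aki, 1965)
    return b, b_err


def project_on_principal_axis(x_km, y_km):
    """Project epicenters onto the first principal axis of the event cloud.

    Assumption: x_km and y_km are local Cartesian offsets in km.
    Returns along-axis distances (relative to the cloud centroid) and
    the unit axis vector. The sign of the axis is arbitrary.
    """
    xy = np.column_stack([x_km, y_km]).astype(float)
    centered = xy - xy.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    _, evecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    axis = evecs[:, -1]             # eigenvector of the largest eigenvalue
    return centered @ axis, axis
```

Applying `b_value_mle` to successive time windows, or to bins of the along-axis distance returned by `project_on_principal_axis`, would give curves of the kind described in Figure 4e and 4f. The window length, smoothing, and choice of $M_c$ would need to follow the paper itself.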