ATHENA: Agentic Team for Hierarchical Evolutionary Numerical Algorithms
Juan Diego Toscano, Daniel T. Chen, George Em Karniadakis
TL;DR
This work presents ATHENA, an agentic lab that unifies Scientific Computing and Scientific Machine Learning through the HENA loop, which recasts research as a Contextual Bandit problem to achieve sample-efficient discovery. Conceptual Scaffolding constrains the action space to expert blueprints, enabling robust, verifiable methodological evolution via Agentic Teams and a proposer-critic policy. ATHENA demonstrates deep physical reasoning, automatic discovery of exact or highly accurate solvers, and effective human-in-the-loop interventions, including hybrid PINN–FEM workflows for multiphysics problems. The framework achieves state-of-the-art accuracy (e.g., $4.76\times 10^{-14}$ MSE for the viscous Burgers equation) and exhibits strong collaborative capabilities, signaling a paradigm shift toward autonomous laboratories that accelerate scientific discovery while preserving mathematical rigor.
Abstract
Bridging the gap between theoretical conceptualization and computational implementation is a major bottleneck in Scientific Computing (SciC) and Scientific Machine Learning (SciML). We introduce ATHENA (Agentic Team for Hierarchical Evolutionary Numerical Algorithms), an agentic framework designed as an Autonomous Lab to manage the end-to-end computational research lifecycle. Its core is the HENA loop, a knowledge-driven diagnostic process framed as a Contextual Bandit problem. Acting as an online learner, the system analyzes prior trials to select structural `actions' ($A_n$) from combinatorial spaces guided by expert blueprints (e.g., Universal Approximation, Physics-Informed constraints). These actions are translated into executable code ($S_n$), whose execution yields scientific rewards ($R_n$). ATHENA goes beyond standard automation: in SciC, it autonomously identifies mathematical symmetries that yield exact analytical solutions, or derives stable numerical solvers where foundation models fail. In SciML, it performs deep diagnosis to tackle ill-posed formulations and builds hybrid symbolic-numeric workflows (e.g., coupling PINNs with FEM) to resolve multiphysics problems. The framework achieves super-human performance, reaching validation errors of $10^{-14}$. Furthermore, collaborative ``human-in-the-loop'' intervention allows the system to bridge stability gaps, improving results by an order of magnitude. This paradigm shifts the focus from implementation mechanics to methodological innovation, accelerating scientific discovery.
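The select-implement-evaluate cycle of the HENA loop can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the blueprint axes, the synthetic error model, and the helper names (`BLUEPRINT`, `implement`, `propose`, `hena_loop`) are hypothetical, and a simple epsilon-greedy, one-axis mutation rule stands in for the proposer-critic policy, which is not specified here.

```python
import itertools
import math
import random

# Hypothetical blueprint: each axis is one structural design choice.
# The actual action space is not given in the text.
BLUEPRINT = {
    "architecture": ["mlp", "fourier_features"],
    "loss": ["data_only", "physics_informed"],
    "optimizer": ["adam", "lbfgs"],
}
ACTIONS = [dict(zip(BLUEPRINT, combo))
           for combo in itertools.product(*BLUEPRINT.values())]


def implement(action):
    """Stand-in for translating an action A_n into executable code S_n."""
    def solver():
        # Synthetic validation error, purely illustrative (not a real solver).
        error = 1.0
        if action["loss"] == "physics_informed":
            error *= 1e-2
        if action["architecture"] == "fourier_features":
            error *= 1e-1
        if action["optimizer"] == "lbfgs":
            error *= 1e-1
        return error
    return solver


def propose(history, rng, epsilon):
    """Pick the next untried action given the trial history (the context)."""
    tried = [a for a, _ in history]
    untried = [a for a in ACTIONS if a not in tried]
    if not untried:
        return None
    if history and rng.random() >= epsilon:
        best, _ = max(history, key=lambda h: h[1])
        # Exploit: change exactly one axis of the best action so far.
        neighbors = [a for a in untried
                     if sum(a[k] != best[k] for k in BLUEPRINT) == 1]
        if neighbors:
            return rng.choice(neighbors)
    return rng.choice(untried)  # explore


def hena_loop(n_trials=8, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    history = []  # (A_n, R_n) pairs from earlier trials
    for _ in range(n_trials):
        action = propose(history, rng, epsilon)  # A_n
        if action is None:
            break
        solver = implement(action)               # S_n
        reward = -math.log10(solver())           # R_n: higher is better
        history.append((action, reward))
    return history


history = hena_loop()
best_action, best_reward = max(history, key=lambda h: h[1])
print(best_action, round(best_reward, 2))
```

In this toy setting the action space has only eight elements, so eight trials exhaust it. The point of the sketch is the data flow (history as context, action, code, reward), not the search efficiency.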
