A Self-Evolving AI Agent System for Climate Science

Zijie Guo; Jiong Wang; Fenghua Ling; Wangxu Wei; Xiaoyu Yue; Zhe Jiang; Wanghan Xu; Jing-Jia Luo; Lijing Cheng; Yoo-Geun Ham; Fengfei Song; Pierre Gentine; Toshio Yamagata; Ben Fei; Wenlong Zhang; Xinyu Gu; Chao Li; Yaqiang Wang; Tao Chen; Wanli Ouyang; Bowen Zhou; Lei Bai

TL;DR

EarthLink is a self-evolving AI system designed as a copilot for climate science. It integrates planning, code generation, data analysis, and physical reasoning to address data fragmentation across Earth system domains. Its architecture comprises Planning, Scientific Diagnosis, and Multi-Scenario Analysis modules, underpinned by Knowledge, Data, and Tool Libraries, with GPT‑5 as the foundation model and per-module parameter customization. In evaluations on 36 climate tasks and a fully open-ended Atlantic Niño discovery scenario, EarthLink demonstrates near-junior-researcher proficiency, autonomous hypothesis generation, and iterative self-improvement; the same evaluations highlight its limitations and the need for transparent human oversight. The framework envisions a new human–AI research paradigm that can compress scientific discovery timelines, broaden participation, and produce transparent, reproducible workflows, and the work provides a public platform and open tooling to accelerate Earth sciences research. The overall score is conceptually defined as the mean regional reliability across tasks, $\mathrm{Score}_{\mathrm{overall}} = \frac{1}{M}\sum_{m=1}^{M} S_m$, where each $S_m$ denotes a per-region performance metric and $M$ is the number of scores averaged, enabling quantitative cross-domain comparisons of AI-assisted climate science.
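The averaging in the score definition above can be illustrated with a minimal Python sketch. The function name `overall_score` and the example values are assumptions made purely for illustration; they are not taken from the paper, and the sketch does not describe how EarthLink computes each per-region score $S_m$.

```python
from statistics import fmean


def overall_score(scores):
    """Return (1/M) * sum(S_m) over the M supplied scores.

    `scores` is a sequence of per-region metrics S_m (hypothetical inputs).
    """
    if not scores:
        raise ValueError("at least one score S_m is required")
    return fmean(scores)


# Hypothetical per-region scores S_1..S_4 (illustrative values, not paper results).
example_scores = [0.5, 0.75, 1.0, 0.25]
overall = overall_score(example_scores)
print(overall)
```

Because the score is an unweighted mean, every region contributes equally regardless of its area or sample size; a weighted variant would be a different definition from the one stated above.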

Abstract

Scientific progress in Earth science depends on integrating data across the planet's interconnected spheres. However, the accelerating volume and fragmentation of multi-sphere knowledge and data have surpassed human analytical capacity. This creates a major bottleneck for discovery, especially in climate science. To address this challenge, we introduce EarthLink, the first self-evolving AI agent system designed as an interactive "copilot" for Earth scientists. Through natural language interaction, EarthLink automates the entire research workflow by integrating planning, code execution, data analysis, and physical reasoning into a unified process that directly addresses this limitation. Beyond efficiency, it exhibits human-like cross-disciplinary analytical ability and achieves proficiency comparable to a junior researcher in expert evaluations on core large-scale climate tasks, including model-observation comparison and climate change understanding. When tasked with an open scientific problem, specifically the discovery of precursors of the Atlantic Niño, EarthLink autonomously developed a research strategy, identified sources of predictability, verified its hypotheses with available data, and proposed a physically consistent mechanism. These emerging capabilities enable a new human-AI research paradigm. Scientists can focus on value and result judgments, while AI systems handle complex data analysis and knowledge integration. This accelerates the pace and breadth of discovery in Earth sciences. The system is accessible at our website https://earthlink.intern-ai.org.cn.
