EWE: An Agentic Framework for Extreme Weather Analysis

Zhe Jiang, Jiong Wang, Xiaoyu Yue, Zijie Guo, Wenlong Zhang, Fenghua Ling, Wanli Ouyang, Lei Bai

TL;DR

The paper tackles the bottleneck of diagnosing extreme-weather mechanisms by introducing EWE, the first autonomous agent designed for end-to-end diagnostic analysis of extreme events. EWE combines knowledge-guided planning with Chain-of-Thought prompts, self-evolving closed-loop reasoning, and a Meteorological Toolkit to retrieve data, construct Python-based visualizations, and interpret multimodal outputs via a multimodal LLM. A 103-event benchmark and a novel stepwise evaluation framework assess performance across code, visualizations, and physical diagnostics, with GPT-4.1 serving as the evaluator in a dual-protocol setup. The results show clear model specialization, the necessity of integrated analysis tools and auditors, and the potential for automated scientific discovery and democratized expertise in regions vulnerable to extreme weather, with future work toward real-time data integration and broader event coverage.

Abstract

Extreme weather events pose escalating risks to global society, underscoring the urgent need to unravel their underlying physical mechanisms. Yet the prevailing expert-driven, labor-intensive diagnostic paradigm has created a critical analytical bottleneck, stalling scientific progress. While AI for Earth Science has achieved notable advances in prediction, the equally essential challenge of automated diagnostic reasoning remains largely unexplored. We present the Extreme Weather Expert (EWE), the first intelligent agent framework dedicated to this task. EWE emulates expert workflows through knowledge-guided planning, closed-loop reasoning, and a domain-tailored meteorological toolkit. It autonomously produces and interprets multimodal visualizations from raw meteorological data, enabling comprehensive diagnostic analyses. To catalyze progress, we introduce the first benchmark for this emerging field, comprising a curated dataset of 103 high-impact events and a novel step-wise evaluation metric. EWE marks a step toward automated scientific discovery and offers the potential to democratize expertise and intellectual resources, particularly for developing nations vulnerable to extreme weather.

Paper Structure

This paper contains 13 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: EWE identifies the driving factors of extreme events through a human-like reasoning process, progressively retrieving data and using a physics-based diagnostic toolkit.
  • Figure 2: Overview of the EWE framework. A user request starts the self-evolving closed-loop reasoning, and the framework ends with a multi-faceted analytical report.
  • Figure 3: For all extreme event types, human-drafted checklists are refined by an LLM. The analytical reports generated by EWE are then assessed against the corresponding refined criteria.
  • Figure 4: Statistics of extreme events in the test set: the left panel shows the distribution of event categories, and the right panel shows the geographical distribution.
  • Figure 5: Workflow example of extreme precipitation event analysis. Each step, from top to bottom, presents the agent's thought, action, observation, and interpretation.
  • ...and 1 more figure