On the Opportunities of (Re)-Exploring Atmospheric Science by Foundation Models: A Case Study

Lujia Zhang; Hanzhe Cui; Yurong Song; Chenyue Li; Binhang Yuan; Mengqian Lu

TL;DR

The paper investigates whether a state-of-the-art multimodal foundation model, GPT-4o, can handle a broad range of atmospheric-science tasks, evaluating it across four task classes: climate data processing, physical diagnosis, forecast and prediction, and adaptation and mitigation. GPT-4o shows strong capabilities in information extraction, numerical calculation, and classical analyses (e.g., EOF via PCA), with transparent, reproducible code outputs. It shows substantial limitations in reliable short- to long-range forecasting (e.g., 24- to 96-hour scales and ENSO predictions) when domain-specific modeling and papers are not embedded in the workflow. The study highlights the potential of GPT-4o to automate tedious routines and support reasoning, while underscoring the need for domain-specific foundation models and human–AI collaboration to achieve robust, physics-grounded predictions. Overall, the work offers a realistic assessment of current foundation-model capabilities in atmospheric science and points future research toward targeted model development, better data handling, and prompt-engineering strategies that draw on domain knowledge.

Abstract

Most state-of-the-art AI applications in atmospheric science are based on classic deep learning approaches. However, such approaches cannot automatically integrate multiple complicated procedures to construct an intelligent agent, since each functionality is enabled by a separate model learned from independent climate datasets. The emergence of foundation models, especially multimodal foundation models, with their ability to process heterogeneous input data and execute complex tasks, offers a substantial opportunity to overcome this challenge. In this report, we explore a central question: how does the state-of-the-art foundation model GPT-4o perform on various atmospheric science tasks? To this end, we conduct a case study that categorizes the tasks into four main classes: climate data processing, physical diagnosis, forecast and prediction, and adaptation and mitigation. For each task, we comprehensively evaluate GPT-4o's performance and provide a concrete discussion. We hope that this report sheds new light on future AI applications and research in atmospheric science.

Paper Structure (19 sections, 3 equations, 12 figures, 1 table)

This paper contains 19 sections, 3 equations, 12 figures, 1 table.

Introduction
Climate Data Processing
Information Extraction
Statistical Calculation
Classical Algorithm
Physical Diagnosis
Extreme Weather Detection
Inference from Meteorological Variable Fields
Statistical Modeling
Forecast and Prediction
Short-term Temperature Prediction
Subseasonal-to-Seasonal Precipitation Prediction
Long-lead ENSO Forecast
Adaptation and Mitigation
Urban Planning and Climate Adaptation
...and 4 more sections

Figures (12)

Figure 1: Schematic of the relationship between the foundation model and different types of tasks in atmospheric science.
Figure 2: Distribution of global sea surface temperature data on May 1, 2024, at 00:00:00. The green boundary indicates the region of the subpolar gyre. The data is sent to GPT-4o, and GPT-4o correctly processes this information. Refer to Section 2.1 for detailed discussions.
Figure 3: Distribution of the sea surface temperature trend from 2001 to 2020. The plotted data is generated by GPT-4o, which correctly calculates the trend. Refer to Section 2.2 for detailed discussions.
Figure 4: Three EOF patterns generated by GPT-4o. Shading represents the sea level pressure for each pattern. The variance fraction of each EOF pattern is indicated at the top of each panel. The plotted data is generated by GPT-4o. Refer to Section 2.3 for detailed discussions.
Figure 5: The region of extreme precipitation identified by GPT-4o. Shading represents the total precipitation. The red box highlights the region with extreme precipitation.
...and 7 more figures
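
For readers unfamiliar with the EOF analysis referenced in the TL;DR and in Figure 4, the following is a minimal sketch of EOF decomposition via PCA, computed with a singular value decomposition. It is not the code GPT-4o produced in the paper. The function name `eof_analysis`, the synthetic field, and all array sizes are assumptions made for illustration, and the area (latitude) weighting that is usually applied to gridded data is omitted.

```python
import numpy as np


def eof_analysis(field, n_modes=3):
    """EOF analysis of a (time, space) array via SVD (hypothetical helper).

    Returns the leading spatial patterns (EOFs), their principal-component
    time series, and the fraction of variance each mode explains.
    """
    # Remove the time mean at every grid point to obtain anomalies.
    anomalies = field - field.mean(axis=0, keepdims=True)

    # SVD of the anomaly matrix: rows of vt are orthonormal spatial patterns.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)

    # Squared singular values are proportional to the variance of each mode.
    variance_fraction = s**2 / np.sum(s**2)

    eofs = vt[:n_modes]                  # shape (n_modes, space)
    pcs = u[:, :n_modes] * s[:n_modes]   # shape (time, n_modes)
    return eofs, pcs, variance_fraction[:n_modes]


# Synthetic example (assumed data): one dominant spatial pattern plus noise.
rng = np.random.default_rng(0)
n_time, n_space = 120, 50
pattern = np.sin(np.linspace(0.0, np.pi, n_space))
amplitude = rng.standard_normal(n_time)
noise = 0.1 * rng.standard_normal((n_time, n_space))
field = 5.0 * np.outer(amplitude, pattern) + noise

eofs, pcs, var_frac = eof_analysis(field)
print("Variance fraction of leading modes:", var_frac)
```

With real gridded data such as sea level pressure, the (time, lat, lon) array would first be reshaped to (time, space), and each returned EOF reshaped back to (lat, lon) for plotting.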
