ClimateLLM: Efficient Weather Forecasting via Frequency-Aware Large Language Models

Shixuan Li, Wei Yang, Peiyu Zhang, Xiongye Xiao, Defu Cao, Yuehan Qin, Xiaole Zhang, Yue Zhao, Paul Bogdan

TL;DR

ClimateLLM introduces a frequency-aware weather forecasting foundation model that combines FFT-based spectral decomposition with a mixture-of-experts, dynamic prompting, and a GPT-2 backbone to capture multi-scale spatiotemporal patterns. The approach addresses extreme-event modeling and efficiency by operating in the frequency domain, reusing pre-trained parameters, and applying latitude-weighted optimization. Extensive ERA5/WeatherBench2 experiments show superior ACC and RMSE across short- and long-horizon forecasts, with substantial efficiency gains and strong zero-/few-shot generalization. The work offers a scalable, deployable framework with potential for physics-informed extensions and advanced reasoning in future climate analytics.
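The summary above mentions latitude-weighted optimization and RMSE but does not give the formula. The sketch below shows the convention commonly used with WeatherBench-style evaluation on an equiangular grid, where each latitude row is weighted by the cosine of its latitude, normalized to mean 1. This is an assumption about the general technique, not the paper's exact loss; the function names and the 5-degree grid are hypothetical.

```python
import numpy as np

def latitude_weights(lats_deg):
    """cos(latitude) weights, normalized so they average to 1."""
    w = np.cos(np.deg2rad(lats_deg))
    return w / w.mean()

def lat_weighted_rmse(pred, truth, lats_deg):
    """RMSE over a (lat, lon) grid, weighting each row by its latitude weight."""
    w = latitude_weights(lats_deg)[:, None]  # broadcast across longitude
    return float(np.sqrt(np.mean(w * (pred - truth) ** 2)))

# Hypothetical 5-degree grid: 36 latitude rows, 72 longitude columns.
lats = np.linspace(-87.5, 87.5, 36)
truth = np.zeros((36, 72))
pred = np.ones((36, 72))  # constant error of 1 everywhere
print(lat_weighted_rmse(pred, truth, lats))
```

Because the weights average to 1, a spatially constant error gives a weighted RMSE equal to that error up to floating-point rounding. Errors near the poles, where grid cells cover less area, count for less than errors near the equator.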

Abstract

Weather forecasting is crucial worldwide for public safety, disaster prevention and mitigation, agricultural production, and energy management. Although deep learning has significantly advanced weather prediction, current methods face critical limitations: (i) they often struggle to capture both dynamic temporal dependencies and short-term abrupt changes, making extreme weather modeling difficult; (ii) they incur high computational costs due to extensive training and resource requirements; (iii) they adapt poorly to multi-scale frequencies, which makes it hard to separate global trends from local fluctuations. To address these issues, we propose ClimateLLM, a foundation model for weather forecasting. It captures spatiotemporal dependencies via a cross-temporal and cross-spatial collaborative modeling framework that integrates Fourier-based frequency decomposition with Large Language Models (LLMs) to strengthen spatial and temporal modeling. Our framework uses a Mixture-of-Experts (MoE) mechanism that adaptively processes different frequency components, enabling efficient handling of both global signals and localized extreme events. In addition, we introduce a cross-temporal and cross-spatial dynamic prompting mechanism that allows LLMs to incorporate meteorological patterns across multiple scales effectively. Extensive experiments on real-world datasets show that ClimateLLM outperforms state-of-the-art approaches in accuracy and efficiency, offering a scalable solution for global weather forecasting.
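To illustrate what separating global trends from local fluctuations can look like in the frequency domain, here is a minimal sketch that splits a 2D field into low- and high-frequency parts with a 2D FFT. The fixed radial cutoff is an assumption made for illustration only; the paper describes a learned Mixture-of-Experts over frequency components, which this sketch does not implement. The function name, grid size, and cutoff value are hypothetical.

```python
import numpy as np

def split_frequencies(field, cutoff):
    """Split a real 2D field into low- and high-frequency parts via 2D FFT.

    `cutoff` is a radial frequency threshold in cycles per grid cell.
    """
    spec = np.fft.fft2(field)
    fy = np.fft.fftfreq(field.shape[0])[:, None]  # row frequencies
    fx = np.fft.fftfreq(field.shape[1])[None, :]  # column frequencies
    low_mask = np.sqrt(fy ** 2 + fx ** 2) <= cutoff
    # The radial mask is symmetric, so the inverse transforms are real
    # up to rounding; .real drops the negligible imaginary part.
    low = np.fft.ifft2(spec * low_mask).real
    high = np.fft.ifft2(spec * ~low_mask).real
    return low, high

# Hypothetical field: a smooth large-scale wave plus a sharp localized bump.
y, x = np.mgrid[0:32, 0:64]
smooth = np.sin(2 * np.pi * x / 64)
bump = 5.0 * np.exp(-((y - 16) ** 2 + (x - 40) ** 2) / 2.0)
field = smooth + bump

low, high = split_frequencies(field, cutoff=0.1)
print(np.allclose(low + high, field))
```

The two masks partition the spectrum, so the parts sum back to the original field. The smooth wave lands in the low band, while much of the bump's sharp structure lands in the high band.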

Paper Structure

This paper contains 39 sections, 20 equations, 5 figures, 7 tables, 1 algorithm.

Figures (5)

  • Figure 1: Overall framework of the proposed ClimateLLM. (a) The two-dimensional time-series weather data $X_{\text{hist}}$ is transformed into the frequency domain via 2D FFT. (b) A Mixture-of-Experts approach adaptively learns different frequency components. (c) Learnable prompts at the weather variable and temporal levels perform cross-attention for meta fusion. (d) The prompts and frequency domain tokens are fed into an LLM to capture spatiotemporal patterns, yielding predictions $X_{\text{pred}}$.
  • Figure 2: Few-shot Forecasting Results, with training sample scale ranging from 10% to 100%.
  • Figure 3: Sensitivity analysis of the number of GPT layers.
  • Figure 4: Case study of variable $t2m$. (a) True value at $t_0$. (b) True value at $t_1$. (c) ClimateLLM prediction at $t_1$. (d) Difference between the true values at $t_0$ and $t_1$. (e) Difference between the prediction at $t_1$ and the true value at $t_1$.
  • Figure 5: Case Study of variable t