Regional Weather Variable Predictions by Machine Learning with Near-Surface Observational and Atmospheric Numerical Data

Yihe Zhang, Bryce Turney, Purushottam Sigdel, Xu Yuan, Eric Rappin, Adrian Lago, Sytske Kimball, Li Chen, Paul Darby, Lu Peng, Sercan Aygun, Yazhou Tu, M. Hassan Najafi, Nian-Feng Tzeng

TL;DR

This work tackles the challenge of producing accurate, fine-grained regional weather forecasts by fusing high-frequency near-surface observations with coarser, hourly atmospheric numerical model outputs. The authors introduce MiMa, an encoder–decoder Transformer framework with two encoders (micro and macro) and a decoder, yielding modelets that each predict a single weather parameter at a given location; they also extend this to Re-MiMa for ungauged locations via regional transfer learning using elevations. Across Kentucky Mesonet stations and four weather parameters, MiMa outperforms baselines, and Re-MiMa achieves region-wide accuracy at ungauged sites, demonstrating a practical pathway to high-resolution regional nowcasting. The approach has strong implications for real-time decision-making in sectors like transportation and emergency management, and code/data availability supports replication and regional adaptation of the method.

Abstract

Accurate and timely regional weather prediction is vital for sectors dependent on weather-related decisions. Traditional prediction methods, based on atmospheric equations, often struggle with coarse temporal resolutions and inaccuracies. This paper presents a novel machine learning (ML) model, called MiMa (short for Micro-Macro), that integrates both near-surface observational data from Kentucky Mesonet stations (collected every five minutes, known as Micro data) and hourly atmospheric numerical outputs (termed Macro data) for fine-resolution weather forecasting. The MiMa model employs an encoder-decoder transformer structure, with two encoders for processing multivariate data from both datasets and a decoder for forecasting weather variables over short time horizons. Each instance of the MiMa model, called a modelet, predicts the values of a specific weather parameter at an individual Mesonet station. The approach is extended with Re-MiMa modelets, which are designed to predict weather variables at ungauged locations by training on multivariate data from a few representative stations in a region, tagged with their elevations. Re-MiMa (short for Regional-MiMa) can provide highly accurate predictions across an entire region, even in areas without observational stations. Experimental results show that MiMa significantly outperforms current models, with Re-MiMa offering precise short-term forecasts for ungauged locations, marking a significant advancement in weather forecasting accuracy and applicability.
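To make the two input cadences concrete, the following is a minimal sketch of how one training sample for a single modelet might be assembled from five-minute Micro rows and hourly Macro rows. The function name `build_sample`, the window lengths, the forecast horizon, and the choice to end the Macro window at the hour covering the latest Micro step are all illustrative assumptions; the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

MICRO_STEP_MIN = 5   # Micro cadence stated in the abstract
MACRO_STEP_MIN = 60  # Macro cadence stated in the abstract


def build_sample(
    micro: Sequence[Sequence[float]],  # one row of station variables per 5-minute step
    macro: Sequence[Sequence[float]],  # one row of numerical-model variables per hour
    t: int,                            # index of the latest available Micro step
    target_col: int,                   # Micro column this modelet predicts
    micro_len: int = 12,               # hypothetical: one hour of Micro history
    macro_len: int = 3,                # hypothetical: three hours of Macro history
    horizon: int = 6,                  # hypothetical: predict the next 30 minutes
) -> Tuple[List[List[float]], List[List[float]], List[float]]:
    """Return (Micro window, Macro window, future target values) for step t."""
    steps_per_hour = MACRO_STEP_MIN // MICRO_STEP_MIN
    hour = t // steps_per_hour  # assumed: Macro row covering step t is available
    if (
        t + 1 < micro_len
        or hour + 1 < macro_len
        or hour >= len(macro)
        or t + horizon >= len(micro)
    ):
        raise ValueError("not enough data around step t")
    x_micro = [list(row) for row in micro[t - micro_len + 1 : t + 1]]
    x_macro = [list(row) for row in macro[hour - macro_len + 1 : hour + 1]]
    y = [micro[t + k][target_col] for k in range(1, horizon + 1)]
    return x_micro, x_macro, y
```

Keeping the two windows separate, rather than resampling them onto a common time grid, mirrors the abstract's description of one encoder per dataset; whether the authors also resample is not stated.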

Paper Structure

This paper contains 19 sections, 9 equations, 10 figures, 11 tables.

Figures (10)

  • Figure 1: Kentucky Mesonet weather observation stations, denoted by yellow circles. The stations chosen for MiMa model evaluation are indicated by red line segments and tagged with their latitudes, longitudes, and elevations.
  • Figure 2: Overview of the MiMa (short for Micro-Macro) model, which takes data from both an individual station and WRF-HRRR model computations as input and yields the weather variable predictions.
  • Figure 3: Structure of the Micro model, with the hidden state $\textbf{H}_{t}$ obtained by inputting $\textbf{X}_{\text{micro}}$ to an encoder and the output $\textbf{O}_{t+1}$ obtained by inputting $\textbf{H}_{t}$ plus $\textbf{Y}_{0}$ to a decoder. Output $\textbf{O}_{t+1}$ is then passed to a fully connected layer, which generates the predicted parameter value $\textbf{Y}_{t+1}$.
  • Figure 4: Structure of the MiMa model, with its Micro Encoder and its Decoder identical to those depicted in Fig. 3 and with the hidden states of the Micro and the Macro Encoders concatenated as the Decoder's input.
  • Figure 6: Ensemble temperature prediction plot of MiMa modelet for Station FARM.
  • ...and 5 more figures
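
The data flow described in the captions of Figures 3 and 4 can be traced with a small shape-level sketch. This is not the authors' implementation: the encoders, decoder, and fully connected layer below are plain linear maps with random weights standing in for Transformer layers, the decoder is reduced to mean pooling plus an embedded seed value, and concatenating the two hidden states along the time axis is an assumption (the captions do not say which axis is used). All names (`linear`, `init_params`, `mima_forward`) are hypothetical.

```python
import random
from typing import Dict, List

Matrix = List[List[float]]


def linear(x: Matrix, w: Matrix) -> Matrix:
    """Apply a weight matrix row by row: (n, d_in) times (d_in, d_out) gives (n, d_out)."""
    cols = list(zip(*w))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def init_params(d_micro: int, d_macro: int, d_model: int, seed: int = 0) -> Dict[str, Matrix]:
    """Random stand-in weights; a real modelet would learn Transformer parameters."""
    rng = random.Random(seed)

    def w(d_in: int, d_out: int) -> Matrix:
        return [[rng.uniform(-0.1, 0.1) for _ in range(d_out)] for _ in range(d_in)]

    return {
        "enc_micro": w(d_micro, d_model),  # stand-in for the Micro Encoder
        "enc_macro": w(d_macro, d_model),  # stand-in for the Macro Encoder
        "dec_in": w(1, d_model),           # embeds the decoder seed Y_0
        "fc": w(d_model, 1),               # fully connected output layer
    }


def mima_forward(x_micro: Matrix, x_macro: Matrix, y0: float, p: Dict[str, Matrix]) -> float:
    h_micro = linear(x_micro, p["enc_micro"])  # (T_micro, d_model)
    h_macro = linear(x_macro, p["enc_macro"])  # (T_macro, d_model)
    memory = h_micro + h_macro                 # concatenation along time (assumed axis)
    d_model = len(memory[0])
    # Stand-in decoder: mean-pool the concatenated hidden states and add the embedded seed.
    context = [sum(row[j] for row in memory) / len(memory) for j in range(d_model)]
    seed_vec = linear([[y0]], p["dec_in"])[0]
    o_next = [c + s for c, s in zip(context, seed_vec)]  # plays the role of O_{t+1}
    return linear([o_next], p["fc"])[0][0]               # plays the role of Y_{t+1}
```

Calling `mima_forward` with, say, twelve Micro rows and three Macro rows returns a single float, matching the caption's description of one predicted parameter value per step; the numeric value itself is meaningless because the weights are random.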