Spatiotemporal Forecasting in Climate Data Using EOFs and Machine Learning Models: A Case Study in Chile

Mauricio Herrera, Francisca Kleisinger, Andrés Wilsón

TL;DR

The paper tackles forecasting in Chile's highly variable climate by combining EOF-based dimensionality reduction, wavelet time-frequency analysis, and machine learning: it forecasts the EOF temporal modes $φ_k(t)$ out to a horizon $h$ and then reconstructs the spatiotemporal field. It introduces DTW-based clustering to identify regionally coherent patterns and uses cluster medoids to make practical, localized forecasts with reduced computational burden. The Wavelet–ANN Hybrid Model outperforms baseline autoregressive approaches in predicting EOF dynamics, enabling a transition from 6355 time series to a manageable set of modes while maintaining useful predictive skill. This approach supports region-specific resource planning and climate adaptation by delivering scalable, interpretable, medium-range forecasts across Chile, with potential extensions to deep-learning-based spatial interpolation for full-field reconstruction.

Abstract

Effective resource management and environmental planning in regions with high climatic variability, such as Chile, demand advanced predictive tools. This study addresses this challenge by employing an innovative and computationally efficient hybrid methodology that integrates machine learning (ML) methods for time series forecasting with established statistical techniques. The spatiotemporal data undergo decomposition using time-dependent Empirical Orthogonal Functions (EOFs), denoted as \(φ_{k}(t)\), and their corresponding spatial coefficients, \(α_{k}(s)\), to reduce dimensionality. Wavelet analysis provides high-resolution time and frequency information from the \(φ_{k}(t)\) functions, while neural networks forecast these functions within a medium-range horizon \(h\). By utilizing various ML models, particularly a Wavelet–ANN hybrid model, we forecast \(φ_{k}(t+h)\) up to a time horizon \(h\), and subsequently reconstruct the spatiotemporal data using these extended EOFs. This methodology is applied to a grid of climate data covering the territory of Chile. It transitions from a high-dimensional multivariate spatiotemporal data forecasting problem to a low-dimensional univariate forecasting problem. Additionally, cluster analysis with Dynamic Time Warping for defining similarities between rainfall time series, along with spatial coherence and predictability assessments, has been instrumental in identifying geographic areas where model performance is enhanced. This approach also elucidates the reasons behind poor forecast performance in regions or clusters with low spatial coherence and predictability. By utilizing cluster medoids, the forecasting process becomes more practical and efficient. This compound approach significantly reduces computational complexity while generating forecasts of reasonable accuracy and utility.
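
The decompose, forecast, and reconstruct pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the field is approximated as \(X(s,t) \approx \bar{X}(s) + \sum_{k=1}^{K} α_{k}(s)\,φ_{k}(t)\) and obtains the modes from an SVD of the centered data matrix, which is one common way to compute EOFs and may differ from the authors' procedure. The function names, the synthetic data, and the AR(1) forecaster are all illustrative assumptions. In particular, the AR(1) step is only a stand-in for the paper's Wavelet–ANN model.

```python
import numpy as np

def eof_decompose(X, K):
    """Split X (T time steps x S grid points) into K temporal modes and spatial coefficients."""
    mean = X.mean(axis=0)                       # per-site time mean
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    phi = U[:, :K]                              # phi_k(t): orthonormal temporal modes, shape (T, K)
    alpha = s[:K, None] * Vt[:K]                # alpha_k(s): spatial coefficients, shape (K, S)
    return mean, phi, alpha

def forecast_ar1(series, h):
    """Stand-in forecaster (NOT the paper's Wavelet-ANN): AR(1) fitted by least squares."""
    x, y = series[:-1], series[1:]
    denom = float(x @ x)
    a = float(x @ y) / denom if denom > 0 else 0.0
    steps, last = [], series[-1]
    for _ in range(h):
        last = a * last                         # iterate the one-step map h times
        steps.append(last)
    return np.array(steps)

def forecast_field(X, K, h):
    """Forecast each of the K modes separately, then rebuild the field for t+1..t+h."""
    mean, phi, alpha = eof_decompose(X, K)
    phi_future = np.column_stack([forecast_ar1(phi[:, k], h) for k in range(K)])
    return mean + phi_future @ alpha            # shape (h, S)

# Synthetic demo: 120 time steps on 50 grid points (made-up data, not the Chilean grid)
rng = np.random.default_rng(0)
T, S, K, h = 120, 50, 3, 6
t = np.arange(T)
X = (np.outer(np.sin(2 * np.pi * t / 12), rng.normal(size=S))
     + 0.5 * np.outer(np.cos(2 * np.pi * t / 12), rng.normal(size=S))
     + 0.05 * rng.normal(size=(T, S)))

mean, phi, alpha = eof_decompose(X, K)
X_hat = forecast_field(X, K, h)
print(X_hat.shape)
```

The point of the design is that only K univariate series are forecast, however many grid points S the field has. The spatial coefficients are held fixed and reused to map the extended modes back to every location.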

Paper Structure

This paper contains 11 sections, 10 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Solution of Seven Clusters Using Hierarchical Method and DTW-Based Distance. Records are segmented into ten-year intervals to illustrate stability versus slight variations in cluster structure.
  • Figure 2: Cluster-Specific Precipitation Patterns from 2018 to 2022
  • Figure 3: DOF and $var(SAI)$ metrics for Rainfall amount (RAm) by clusters (1–7). Groups "All", "0–500m", "500–1500m" and "$>500$m" are included.
  • Figure 4: Comparison of the top 10 spatial coefficients from EOF decomposition of precipitation data for 2018–2021 (panel A) and data including 2022 (panel B).
  • Figure 5: The first three EOF functions (in black) and their predictions (in red). The top three panels display predictions of the EOFs using an LSTM–Recurrent Neural Network with ten hidden layers, while the bottom three panels depict predictions using the Wavelet–ANN Hybrid Model.
  • ...and 1 more figures
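
The DTW-based hierarchical clustering named in Figure 1, together with the cluster medoids mentioned in the abstract, can be sketched as follows. The source states only that a hierarchical method with a DTW-based distance was used. The absolute-difference local cost, the absence of a warping window, the average-linkage rule, and the toy series below are all assumptions made for illustration, and the helper names are hypothetical.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic DTW by dynamic programming (absolute-difference cost, no window)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def pairwise_dtw(series):
    """Symmetric matrix of DTW distances between all pairs of series."""
    n = len(series)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = dtw_distance(series[i], series[j])
    return dist

def average_linkage(dist, n_clusters):
    """Naive agglomerative clustering on a precomputed distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = dist[np.ix_(clusters[p], clusters[q])].mean()
                if d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]   # merge the closest pair
        del clusters[q]
    return clusters

def medoid(dist, members):
    """Member with the smallest total DTW distance to the rest of its cluster."""
    sub = dist[np.ix_(members, members)]
    return members[int(np.argmin(sub.sum(axis=1)))]

# Toy data: time-shifted bumps of height 1 (series 0-2) and height 3 (series 3-5)
t = np.arange(40)
bump = lambda center, height: height * np.exp(-((t - center) ** 2) / 8.0)
series = [bump(c, 1.0) for c in (12, 15, 18)] + [bump(c, 3.0) for c in (12, 15, 18)]

dist = pairwise_dtw(series)
clusters = average_linkage(dist, n_clusters=2)
medoids = [medoid(dist, members) for members in clusters]
print(clusters, medoids)
```

Because DTW aligns series that differ mainly by a time shift, the shifted bumps of equal height end up close together, while bumps of different height stay far apart. Forecasting only the medoid of each cluster is what keeps the number of fitted models small.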