Leveraging Exogenous Signals for Hydrology Time Series Forecasting

Junyang He, Judy Fox, Alireza Jafari, Ying-Jung Chen, Geoffrey Fox

TL;DR

The paper argues that hydrological time series forecasting benefits from integrating domain knowledge, exogenous drivers, and physical constraints rather than relying solely on general-purpose foundation models. It compares a domain-informed LSTM with static features and temporal/spatial encodings against foundation models on the CAMELS-US dataset, finding that the domain-aware approach outperforms zero-shot foundation models. Key contributions include a detailed dataset preprocessing and encoding strategy, a rigorous spatial validation setup, and evidence that exogenous inputs and seasonal encodings substantially improve predictive skill. The work highlights the need for science-specific foundation models that balance accuracy with interpretability in hydrology and related domains.

Abstract

Recent advances in time series research facilitate the development of foundation models. While many state-of-the-art time series foundation models have been introduced, few studies examine their effectiveness in specific downstream applications in physical science. This work investigates the role of integrating domain knowledge into time series models for hydrological rainfall-runoff modeling. Using the CAMELS-US dataset, which includes rainfall and runoff data from 671 locations with six time series streams and 30 static features, we compare baseline and foundation models. Results demonstrate that models incorporating comprehensive known exogenous inputs outperform more limited approaches, including foundation models. Notably, incorporating natural annual periodic time series contributes the most significant improvement.
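
To make the idea of a "natural annual periodic time series" concrete, the following is a minimal sketch of one common way to build such an exogenous input: sine and cosine harmonics with a one-year period. The function name `annual_fourier_features`, the number of harmonics, and the 365.25-day period are assumptions chosen for illustration; the exact encoding used in the paper is not specified in this summary.

```python
import numpy as np


def annual_fourier_features(day_index, n_harmonics=2, period=365.25):
    """Build sin/cos features that repeat once per year.

    Hypothetical helper for illustration only; not the paper's code.
    day_index: day counts (scalar or array) measured from any fixed origin.
    Returns an array whose last axis has 2 * n_harmonics columns, ordered
    as sin(1x), cos(1x), sin(2x), cos(2x), ...
    """
    day_index = np.asarray(day_index, dtype=float)
    phase = 2.0 * np.pi * day_index / period  # one full cycle per year
    feats = []
    for k in range(1, n_harmonics + 1):
        feats.append(np.sin(k * phase))
        feats.append(np.cos(k * phase))
    return np.stack(feats, axis=-1)


# Two years of daily time steps, each with four known-in-advance features.
days = np.arange(730)
X = annual_fourier_features(days)
```

Because these features depend only on the calendar, they are known for every future time step and can be concatenated with observed drivers such as rainfall before being passed to a sequence model.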

Paper Structure

This paper contains 12 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: LSTM model pipeline.
  • Figure 2: Effects of spatial–temporal encodings and static features (validation loss). Left: RMSE by forecasted variable across four model configurations. Right: RMSE by spatial–temporal encoding configuration (Just TS = time series only; Extra = additional Fourier + Legendre encodings).
  • Figure 3: Timeseriesviz package example output. The figure shows the performance of the LSTM, Chronos-bolt, and Sundial models on streamflow data.