Physics Guided Machine Learning Methods for Hydrology

Ankush Khandelwal, Shaoming Xu, Xiang Li, Xiaowei Jia, Michael Steinbach, Christopher Duffy, John Nieber, Vipin Kumar

TL;DR

The paper tackles streamflow prediction by integrating physics-informed insights into neural models, explicitly modeling intermediate hydrological states and fluxes via multi-task learning. It introduces three architectures (MTL, SA-MTL, and H-SA-MTL) that are trained with intermediate variables but predict from weather inputs alone, and it employs self-paced learning to balance task losses. On a 1000-year SWAT-generated dataset, the approach substantially reduces RMSE, with the hierarchical state-aware model performing best and showing notable improvements in soil moisture handling, though snowpack remains difficult. The work demonstrates the potential of physics-guided, hierarchical learning to enhance hydrological predictions and lays out future directions, such as mass-conservation constraints and cross-catchment pretraining, to improve generalization to real observations.

Abstract

Streamflow prediction is one of the key challenges in hydrology because of the complex interplay between the multiple non-linear physical mechanisms behind streamflow generation. While physics-based models are rooted in a rich understanding of the physical processes, a significant performance gap remains, which can potentially be addressed by leveraging recent advances in machine learning. The goal of this work is to incorporate our understanding of hydrological processes and constraints into machine learning algorithms to improve predictive performance. Traditional ML models for this problem predict streamflow using weather drivers as input. However, multiple intermediate processes interact to generate streamflow from weather drivers. The key idea of the approach is to explicitly model these intermediate processes, which connect weather drivers to streamflow, using a multi-task learning framework. While the proposed approach requires data about intermediate processes during training, only weather drivers are needed to predict streamflow during the testing phase. We assess the efficacy of the approach on a simulation dataset generated by the SWAT model for a catchment located in the South Branch of the Root River Watershed in southeast Minnesota. While the focus of this paper is on improving performance given data from a single catchment, the methodology presented here is applicable to ML-based approaches that use data from multiple catchments to improve the performance of each individual catchment.
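
The multi-task idea in the abstract can be illustrated with a minimal sketch of a combined training loss, in which the streamflow error is summed with the errors on intermediate variables. This is not the authors' implementation: the function names, the task names, the fixed weights, and the toy numbers are all assumptions made for illustration, and the paper balances tasks with self-paced learning rather than fixed weights.

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(preds, targets, weights):
    """Weighted sum of per-task MSE losses.

    preds, targets: dicts mapping a task name to a sequence of values.
    weights: dict mapping a task name to a fixed weight (an assumption
    of this sketch, standing in for the paper's self-paced weighting).
    """
    per_task = {name: mse(preds[name], targets[name]) for name in preds}
    total = sum(weights[name] * loss for name, loss in per_task.items())
    return total, per_task


# Toy values, invented for illustration only.
preds = {
    "streamflow": [1.0, 2.0, 3.0],   # main task
    "soil_water": [0.5, 0.5, 0.5],   # intermediate state (auxiliary task)
}
targets = {
    "streamflow": [1.0, 2.0, 5.0],
    "soil_water": [0.5, 1.5, 0.5],
}
weights = {"streamflow": 1.0, "soil_water": 0.5}

total, per_task = multitask_loss(preds, targets, weights)
```

The intermediate-variable targets enter only through the loss, so they are needed during training alone; at prediction time a model trained this way would still take only weather drivers as input.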

Paper Structure

This paper contains 7 sections and 8 figures.

Figures (8)

  • Figure 1: A graphical abstraction of the hydrological cycle. Intermediate variables are represented by dashed circles. Red denotes state variables, whereas blue denotes fluxes.
  • Figure 2: A physics-guided deep learning framework for estimating streamflow.
  • Figure 3: Physics-guided deep learning architecture for estimating streamflow. Red denotes state variables, whereas blue denotes fluxes.
  • Figure 4: A simulation timeseries of Soil water timeseries from
  • Figure 5: An illustration to depict how initial values are passed while training and prediction.
  • ...and 3 more figures