Hydra-LSTM: A semi-shared Machine Learning architecture for prediction across Watersheds

Karan Ruparell, Robert J. Marks, Andy Wood, Kieran M. R. Hunt, Hannah L. Cloke, Christel Prudhomme, Florian Pappenberger, Matthew Chantry

TL;DR

The Hydra-LSTM's ability to incorporate catchment-specific data is tested by introducing historical river discharge as a catchment-specific input; it outperforms state-of-the-art models without needing to train an entirely new model.

Abstract

Long Short Term Memory networks (LSTMs) are used to build single models that predict river discharge across many catchments. These models offer greater accuracy than models trained on each catchment independently if using the same data. However, the same data is rarely available for all catchments. This prevents the use of variables available only in some catchments, such as historic river discharge or upstream discharge. The only existing method that allows for optional variables requires all variables to be considered in the initial training of the model, limiting its transferability to new catchments. To address this limitation, we develop the Hydra-LSTM. The Hydra-LSTM processes variables used across all catchments and variables used in only some catchments separately to allow general training and use of catchment-specific data in individual catchments. The bulk of the model can be shared across catchments, maintaining the benefits of multi-catchment models to generalise, while also benefitting from the advantages of using bespoke data. We apply this methodology to 1 day-ahead river discharge prediction in the Western US, as next-day river discharge prediction is the first step towards prediction across longer time scales. We obtain state-of-the-art performance, generating more accurate median and quantile predictions than Multi-Catchment and Single-Catchment LSTMs while allowing local forecasters to easily introduce and remove variables from their prediction set. We test the ability of the Hydra-LSTM to incorporate catchment-specific data by introducing historical river discharge as a catchment-specific input, outperforming state-of-the-art models without needing to train an entirely new model.

Paper Structure

This paper contains 16 sections, 3 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Comparison of different LSTMs for hydrological modelling regarding their data requirements, usability across catchments, and flexibility in adding new variables. The architectures include Single-Catchment LSTMs, Multi-Catchment LSTM without River Discharge, Multi-Catchment LSTM with River Discharge, Flag LSTM, and Hydra-LSTM. The checkmarks indicate the presence of a feature, the crosses indicate the absence of a feature, and the circles indicate partial usability.
  • Figure 2: Diagram of the Hydra model architecture. The leftmost box shows the time series data available at all catchments, including historical data, forecast data, and static catchment attributes. An encoding LSTM, dubbed the Hydra Body, processes these and produces a lower-dimensional encoding of the information. If no further data is available, this encoding is passed to the Multi-Catchment Head, an LSTM that transforms the encoding into quantile discharge predictions. However, if further information is available for that catchment, the encoding is passed to a Single-Catchment Head alongside the additional time series data, and the two are combined to produce quantile discharge predictions.
  • Figure 3: Plot of the catchment sites evaluated in this study and the mean annual temperature over each catchment, in Kelvin. All catchments are located in the Western US; the plot shows the Western US, with an inset showing a map of the US as a whole.
  • Figure 4: Training flow chart for the Hydra-LSTM. The Multi-Catchment Head and Hydra Body are trained first, and the Single-Catchment Heads can then be trained using the outputs from the Hydra Body as some of their inputs.
  • Figure 5: Cumulative distribution plot showing the range of Cumulative Quantile Efficiency Scores (CQES) for each model trained without River Discharge as an input. Each individual score is for a single year in a single basin.
  • ...and 2 more figures
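
The routing described in the Figure 2 caption can be sketched in a few lines of plain Python. This is only an illustration of the control flow, not the authors' implementation: in the paper the body and heads are LSTMs, whereas here `toy_body`, `toy_multi_head`, and `make_toy_single_head` are hypothetical arithmetic stand-ins, and the class name `HydraSketch`, the quantile levels, and all numbers are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence

Encoding = List[float]
Quantiles = Dict[float, float]  # quantile level -> predicted discharge


def toy_body(shared_series: Sequence[Sequence[float]]) -> Encoding:
    """Stand-in for the Hydra Body (an LSTM in the paper): one mean per shared variable."""
    return [sum(series) / len(series) for series in shared_series]


def toy_multi_head(encoding: Encoding) -> Quantiles:
    """Stand-in for the Multi-Catchment Head: maps the encoding to quantile predictions."""
    centre = sum(encoding)
    return {0.1: 0.8 * centre, 0.5: centre, 0.9: 1.2 * centre}


def make_toy_single_head(weight: float) -> Callable[[Encoding, Sequence[float]], Quantiles]:
    """Build a stand-in Single-Catchment Head that blends the shared encoding
    with the latest value of a catchment-specific series (hypothetical rule)."""
    def head(encoding: Encoding, extra_series: Sequence[float]) -> Quantiles:
        centre = (1 - weight) * sum(encoding) + weight * extra_series[-1]
        return {0.1: 0.9 * centre, 0.5: centre, 0.9: 1.1 * centre}
    return head


@dataclass
class HydraSketch:
    body: Callable[[Sequence[Sequence[float]]], Encoding]
    multi_head: Callable[[Encoding], Quantiles]
    # One optional head per catchment; heads can be added or removed
    # without touching the shared body.
    single_heads: Dict[str, Callable[[Encoding, Sequence[float]], Quantiles]] = field(
        default_factory=dict
    )

    def predict(
        self,
        catchment_id: str,
        shared_series: Sequence[Sequence[float]],
        extra_series: Optional[Sequence[float]] = None,
    ) -> Quantiles:
        encoding = self.body(shared_series)  # always computed by the shared body
        head = self.single_heads.get(catchment_id)
        if head is None or extra_series is None:
            # No catchment-specific data: fall back to the Multi-Catchment Head.
            return self.multi_head(encoding)
        return head(encoding, extra_series)


model = HydraSketch(body=toy_body, multi_head=toy_multi_head)
shared = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]  # two made-up shared input series

generic = model.predict("catchment_a", shared)

# A local forecaster registers a head that also uses a catchment-specific series.
model.single_heads["catchment_a"] = make_toy_single_head(weight=0.5)
local = model.predict("catchment_a", shared, extra_series=[5.0, 7.0])
other = model.predict("catchment_b", shared)  # still served by the shared head
```

Registering or deleting an entry in `single_heads` leaves `body` and `multi_head` untouched, which mirrors the claim that catchment-specific variables can be introduced without training an entirely new model.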