Evaluating Deep Learning Approaches for Predictions in Unmonitored Basins with Continental-scale Stream Temperature Models

Jared D. Willard, Fabio Ciulla, Helen Weierbach, Vipin Kumar, Charuleka Varadharajan

TL;DR

The prediction of streamflows and other environmental variables in unmonitored basins is a grand challenge in hydrology, and this research offers a comprehensive perspective on optimizing ML model design for accurate predictions in unmonitored regions.

Abstract

The prediction of streamflows and other environmental variables in unmonitored basins is a grand challenge in hydrology. Recent machine learning (ML) models can harness vast datasets for accurate predictions at large spatial scales. However, there are open questions regarding model design and the data needed for inputs and training to improve performance. This study explores these questions while demonstrating the ability of deep learning models to make accurate stream temperature predictions in unmonitored basins across the conterminous United States. First, we compare top-down models that utilize data from a large number of basins with bottom-up methods that transfer ML models built on local sites, reflecting traditional regionalization techniques. We also evaluate an intermediary grouped modeling approach that categorizes sites based on regional co-location or similarity of catchment characteristics. Second, we evaluate trade-offs between model complexity, prediction accuracy, and applicability to more target locations by systematically removing inputs. We then examine model performance when additional training data becomes available due to reductions in input requirements. Our results suggest that top-down models significantly outperform bottom-up and grouped models. Moreover, it is possible to get acceptable accuracy while reducing both dynamic and static inputs, enabling predictions for more sites with lower model complexity and computational needs. From detailed error analysis, we determined that the models are more accurate for sites primarily controlled by air temperatures than for locations impacted by groundwater and dams. By addressing these questions, this research offers a comprehensive perspective on optimizing ML model design for accurate predictions in unmonitored regions.

Paper Structure

This paper contains 39 sections, 5 equations, 34 figures, 20 tables.

Figures (34)

  • Figure 3: Depiction of the three different representations of catchment attributes used as inputs for the top-down models in Experiment 3.
  • Figure 4: Twelve-panel plot showing the performance of the methods used in Experiment 1: LSTM_conus (top-down method), LSTM_regional and LSTM_cluster (grouped methods), and MTL (bottom-up method). The left and middle columns show the hexbin spatial distribution of per-site RMSE and mean bias values in $^{\circ}$C, where the color represents the median within each hexbin. The right column is a two-dimensional histogram showing the distribution of individual stream temperature predictions across all sites, where the color represents the count within each bin.
  • Figure 5: Plot showing the distribution of per-site RMSE values for LSTM_conus (top-down method), LSTM_regional and LSTM_cluster (grouped methods), and MTL (bottom-up method).
  • Figure 6: Bar chart displaying the permutation feature importance for various input features in the LSTM_conus model from Experiment 1, measured by how much the RMSE increases relative to the RMSE calculated using all test observations (1.97$^{\circ}$C). Features are categorized into three groups: meteorology-related features (light blue), streamflow-related features (dark blue), and catchment attributes (red). Only importance values greater than 0.03$^{\circ}$C are shown, to focus on features with a significant impact on the model's predictive accuracy. This threshold is set based on the standard deviation of the RMSE across individual members of the model ensemble, highlighting features whose impact exceeds the ensemble's inherent variability.
  • Figure 7: Nine-panel plot comparing different representations of attributes, showing the spatial distribution and prediction performance of each model. The top row shows the default LSTM_conus, and the second and third rows show the models with the original attributes swapped for z-score-based attributes from ciulla2023data and 21 expert-selected GAGES-II attributes from rahmani2021deep, respectively. The left and middle columns show the hexbin spatial distribution of per-site RMSE and mean bias values in $^{\circ}$C, where the color represents the median within each hexbin. The right column is a two-dimensional histogram showing the distribution of individual stream temperature predictions across all sites, where the color represents the count within each bin.
  • ...and 29 more figures
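
The permutation feature importance described in the Figure 6 caption (the increase in RMSE when one input is shuffled, filtered by a 0.03$^{\circ}$C threshold) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy linear predictor standing in for the LSTM, the synthetic data, and the flat 2-D input layout (the paper's models take time series) are all assumptions made for the example. Only the RMSE-increase definition and the 0.03 threshold come from the caption.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def permutation_importance_rmse(predict, X, y, n_repeats=5, seed=0):
    """Importance of each column of X as the mean increase in RMSE
    after shuffling that column across samples (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    baseline = rmse(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between feature j and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(rmse(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(increases)
    return baseline, importances


# Toy stand-in for a trained model: depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2 (assumed data).
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(2000, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1]


def toy_predict(inputs):
    return 3.0 * inputs[:, 0] + 0.5 * inputs[:, 1]


baseline, importances = permutation_importance_rmse(toy_predict, X, y)

# Keep only features whose RMSE increase exceeds the threshold,
# mirroring the 0.03 degC cutoff mentioned in the caption.
THRESHOLD = 0.03
significant = {j: v for j, v in enumerate(importances) if v > THRESHOLD}
print(baseline, importances, sorted(significant))
```

In this toy setup the unused third feature gets zero importance and is filtered out, while the two features the predictor depends on survive the threshold. With a real ensemble, the threshold would be derived from the spread of RMSE across ensemble members, as the caption describes.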