Toward Routing River Water in Land Surface Models with Recurrent Neural Networks
Mauricio Lima, Katherine Deck, Oliver R. A. Dunbar, Tapio Schneider
TL;DR
This study demonstrates that a runoff-driven LSTM can learn river routing within a global land surface model, achieving generalization across time and basins and outperforming a physics-based benchmark in many settings. By constructing a globally consistent dataset from HydroSHEDS, HydroATLAS, and ERA5-Land, the authors train an LSTM with inputs comprising daily basin runoff and static geographic attributes, and evaluate using NSE and KGE metrics. Key findings show improved time generalization with globally diverse data and reasonable basin generalization, along with notable performance gains over LISFLOOD in both time- and basin-split tests, though challenges remain in arid and data-poor regions. The work highlights practical steps toward integrating ML-based river routing into LSMs, discusses mass-balance considerations, and outlines future directions for inter-basin routing and mass-conserving architectures to enable global-scale applications.
Abstract
Machine learning is playing an increasing role in hydrology, supplementing or replacing physics-based models. One notable example is the use of recurrent neural networks (RNNs) for forecasting streamflow given observed precipitation and geographic characteristics. Training such a model over the continental United States (CONUS) has demonstrated that a single set of model parameters can be used across independent catchments, and that RNNs can outperform physics-based models. In this work, we take a next step and study the performance of RNNs for river routing in land surface models (LSMs). Instead of observed precipitation, the LSM-RNN uses instantaneous runoff calculated from physics-based models as an input. We train the model with data from river basins spanning the globe and test it using historical streamflow measurements. The model demonstrates skill at generalization across basins (predicting streamflow in catchments not used in training) and across time (predicting streamflow during years not used in training). We compare the predictions from the LSM-RNN to an existing physics-based model calibrated with a similar dataset and find that the LSM-RNN outperforms the physics-based model: median NSE increases from 0.56 to 0.64 in the time-split experiment and from 0.30 to 0.34 in the basin-split experiment. Our results show that RNNs are effective for global streamflow prediction from runoff inputs and motivate the development of complete routing models that can capture nested sub-basin connections.
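For readers unfamiliar with the skill scores quoted above, the following is a minimal sketch of how the Nash-Sutcliffe efficiency (NSE) and the Kling-Gupta efficiency (KGE, named in the TL;DR) are commonly computed for one basin, and how a median across basins could be taken. It assumes the standard textbook definitions; the exact KGE variant used in the paper is not stated here, and the function names and toy numbers are hypothetical, not taken from the authors' code or data.

```python
import math
from statistics import fmean, median


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = fmean(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    obs_variance_sum = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance_sum


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed: the original 2009 formulation)."""
    mean_obs, mean_sim = fmean(obs), fmean(sim)
    std_obs = math.sqrt(fmean([(o - mean_obs) ** 2 for o in obs]))
    std_sim = math.sqrt(fmean([(s - mean_sim) ** 2 for s in sim]))
    cov = fmean([(o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)])
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical toy data: observed and simulated streamflow for two basins.
basins = {
    "basin_a": ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]),
    "basin_b": ([5.0, 3.0, 4.0, 6.0], [4.0, 3.5, 4.5, 5.0]),
}
per_basin_nse = [nse(obs, sim) for obs, sim in basins.values()]
median_nse = median(per_basin_nse)  # summary statistic of the kind reported above
```

Both scores equal 1 for a perfect simulation. NSE drops to 0 when the simulation is no better than predicting the observed mean, which is why a median NSE across many basins is a common summary of model skill.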
