CauSTream: Causal Spatio-Temporal Representation Learning for Streamflow Forecasting
Shu Wan, Reepal Shah, John Sabo, Huan Liu, K. Selçuk Candan
TL;DR
CauSTream addresses the challenge of accurate and interpretable long-horizon streamflow forecasting by jointly learning causal spatiotemporal structures. It introduces two hydrology-guided DAGs, the instantaneous forcing DAG G_F and the ℓ-windowed routing DAG G_Q, and grounds their identifiability in nonlinear ICA, enabling data-driven discovery of physically meaningful graphs. The framework combines causal representation learning with a spatiotemporal forecasting backbone and imposes sparsity and acyclicity constraints, yielding interpretable graphs that align with hydrological knowledge while producing state-of-the-art forecasts across three large U.S. basins. Runoff embeddings are validated against VIC simulations, and ablations confirm the value of each component, highlighting CauSTream's potential for robust, explainable watershed analysis and decision support.
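To make the sparsity and acyclicity constraints concrete, the sketch below shows one common way such regularizers are written for a learned weighted adjacency matrix. The summary does not say which formulation CauSTream uses, so everything here is an assumption for illustration: the truncated NOTEARS-style trace penalty, the L1 sparsity term, the function names, and the weights `lam_sparse` and `lam_dag` are all hypothetical.

```python
import numpy as np


def acyclicity_penalty(adj: np.ndarray) -> float:
    """Assumed NOTEARS-style penalty: sum over k = 1..d of tr(M^k) / k!, with M = adj * adj.

    M is entrywise nonnegative, so tr(M^k) > 0 exactly when the graph has a
    closed walk of length k. A cyclic graph on d nodes always has a cycle of
    length at most d, so the sum is zero if and only if the graph is a DAG.
    """
    d = adj.shape[0]
    m = adj * adj              # elementwise square keeps every entry nonnegative
    power = np.eye(d)
    factorial = 1.0
    total = 0.0
    for k in range(1, d + 1):
        power = power @ m      # power now holds M^k
        factorial *= k
        total += np.trace(power) / factorial
    return float(total)


def sparsity_penalty(adj: np.ndarray) -> float:
    """Assumed L1 penalty that pushes small edge weights toward zero."""
    return float(np.abs(adj).sum())


def structure_loss(adj: np.ndarray, lam_sparse: float = 0.01, lam_dag: float = 1.0) -> float:
    """Hypothetical combined regularizer added to a forecasting loss."""
    return lam_sparse * sparsity_penalty(adj) + lam_dag * acyclicity_penalty(adj)


# Toy 3-node graphs: a chain 0 -> 1 -> 2, and the same chain closed by 2 -> 0.
dag = np.array([[0.0, 0.8, 0.0],
                [0.0, 0.0, 0.5],
                [0.0, 0.0, 0.0]])
cyclic = dag.copy()
cyclic[2, 0] = 0.3

print("acyclicity penalty, chain:", acyclicity_penalty(dag))
print("acyclicity penalty, cycle:", acyclicity_penalty(cyclic))
```

On the chain the acyclicity term vanishes, while the added back edge makes it strictly positive, which is the property a gradient-based learner would exploit to push a graph such as G_F or G_Q toward a DAG.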
Abstract
Streamflow forecasting is crucial for water resource management and risk mitigation. While deep learning models have achieved strong predictive performance, they often overlook underlying physical processes, limiting interpretability and generalization. Recent causal learning approaches address these issues by integrating domain knowledge, yet they typically rely on fixed causal graphs that fail to adapt to data. We propose CauSTream, a unified framework for causal spatiotemporal streamflow forecasting. CauSTream jointly learns (i) a runoff causal graph among meteorological forcings and (ii) a routing graph capturing dynamic dependencies across stations. We further establish identifiability conditions for these causal structures in a nonparametric setting. We evaluate CauSTream on three major U.S. river basins across three forecasting horizons. The model consistently outperforms prior state-of-the-art methods, with performance gaps widening at longer forecast windows, indicating stronger generalization to unseen conditions. Beyond forecasting, CauSTream also learns causal graphs that capture relationships among hydrological factors and stations. The inferred structures align closely with established domain knowledge, offering interpretable insights into watershed dynamics. CauSTream offers a principled foundation for causal spatiotemporal modeling, with the potential to extend to a wide range of scientific and environmental applications.
