Interpretable Water Level Forecaster with Spatiotemporal Causal Attention Mechanisms
Sungchul Hong, Yunjin Choi, Jong-June Jeon
TL;DR
This paper tackles interpretable water level forecasting by embedding spatiotemporal causality into a transformer-based model, InstaTran. It introduces SCAN and TAN within a multilayer spatiotemporal causal graph, controlled by masks $M_{\mathcal{S}}$ and $M_{\mathcal{T}}$, and uses a direct multi-quantile decoder trained with composite quantile loss $CQL$. The key contributions include: (i) a principled masking scheme that yields interpretable attention weights aligned with domain knowledge, (ii) a spatiotemporal encoder that strengthens spatial causality, and (iii) robustness to distribution shifts demonstrated on the Han River dataset and US lake data. The approach delivers competitive probabilistic forecasts with interpretable variable importances, and improves generalization under covariate shift, offering practical utility for water resource management and disaster risk mitigation.
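To make the composite quantile loss concrete, the sketch below averages the standard pinball (quantile) loss over several quantile levels. This form is an assumption: the summary does not state the exact definition or weighting of $CQL$, and the names `pinball_loss` and `composite_quantile_loss` are hypothetical, not taken from the paper.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball loss for one observation at quantile level q (0 < q < 1)."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return max(q * diff, (q - 1.0) * diff)


def composite_quantile_loss(y_true, y_pred_by_quantile, quantiles):
    """Average pinball loss over all quantile levels and observations.

    Assumption: equal weight for every quantile level; the paper's CQL
    may use a different normalisation.
    y_pred_by_quantile[k][i] is the forecast of y_true[i] at quantiles[k].
    """
    total = 0.0
    count = 0
    for q, preds in zip(quantiles, y_pred_by_quantile):
        for y, p in zip(y_true, preds):
            total += pinball_loss(y, p, q)
            count += 1
    return total / count


# Illustrative values only, not data from the paper.
observed = [10.0]
levels = [0.1, 0.5, 0.9]
forecasts = [[8.0], [10.0], [12.0]]  # one forecast list per quantile level
loss = composite_quantile_loss(observed, forecasts, levels)
```

Because each quantile level contributes its own asymmetric penalty, minimising a loss of this kind pushes the decoder to output all quantiles in one pass, which is what "direct multi-quantile decoder" suggests.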
Abstract
Accurate forecasting of river water levels is vital for effectively managing traffic flow and mitigating the risks associated with natural disasters. The task is challenging because many intricate factors influence the flow of a river. Recent advances in machine learning have introduced numerous effective forecasting methods. However, these methods lack interpretability due to their complex structure, which limits their reliability. To address this issue, this study proposes a deep learning model that quantifies interpretability, with an emphasis on water level forecasting. The model generates quantitative interpretability measurements that align with the common knowledge embedded in the input data. It does so through a transformer architecture purposefully designed with masking, incorporating a multi-layer network that captures spatiotemporal causation. We perform a comparative analysis on the Han River dataset obtained from Seoul, South Korea, from 2016 to 2021. The results show that our approach offers enhanced interpretability consistent with common knowledge, outperforms competing methods, and enhances robustness against distribution shift.
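The role of masking in attention can be illustrated with a minimal sketch: blocked positions receive zero weight, so the remaining weights can be read as importances over the permitted (causally plausible) inputs only. This is a generic masked softmax, not the InstaTran implementation; the function name `masked_attention_weights` and the convention that a mask value of 1 means "allowed" are assumptions made for illustration.

```python
import math


def masked_attention_weights(scores, mask):
    """Row-wise softmax over attention scores, restricted by a binary mask.

    scores[i][j]: raw attention score from query i to key j.
    mask[i][j]:   1 if query i may attend to key j, 0 otherwise
                  (assumed convention, not taken from the paper).
    """
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        if not allowed:
            raise ValueError("each query needs at least one allowed key")
        peak = max(allowed)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Hypothetical example: query 0 may only attend to key 0,
# query 1 may attend to both keys.
example_scores = [[1.0, 2.0], [0.5, 0.5]]
example_mask = [[1, 0], [1, 1]]
example_weights = masked_attention_weights(example_scores, example_mask)
```

In this example, the first query puts all of its weight on key 0 even though key 1 has the higher raw score, which is the mechanism by which a mask can encode prior knowledge about which inputs are allowed to influence which outputs.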
