Event-CausNet: Unlocking Causal Knowledge from Text with Large Language Models for Reliable Spatio-Temporal Forecasting
Luyao Niu, Zepu Wang, Shuyi Guan, Yang Liu, Peng Sun
TL;DR
Event-CausNet addresses the fragility of spatio-temporal GNNs during non-recurrent disruptions by injecting explicit causal knowledge derived from unstructured event reports. The method combines three components: an offline LLM-based feature extraction pipeline, a causal knowledge base built by estimating average treatment effects (ATE) with Propensity Score Matching, and an online Causal-Enhanced Prediction Network that fuses base spatio-temporal forecasts with a dynamic causal adjustment through a causal-aware attention mechanism. The paper reports up to a 35.87% reduction in MAE over state-of-the-art baselines on the BjTT dataset, along with interpretability through attention analysis and causal explanations. This work narrows the gap between correlational forecasting and causal reasoning, yielding forecasts that are more reliable and transferable during urban disruptions and that come with transparent reasons behind each prediction.
Abstract
While spatio-temporal Graph Neural Networks (GNNs) excel at modeling recurring traffic patterns, their reliability plummets during non-recurring events such as accidents. This failure occurs because GNNs are fundamentally correlational models: they learn historical patterns that are invalidated by the new causal factors a disruption introduces. To address this, we propose Event-CausNet, a framework that uses a Large Language Model to quantify unstructured event reports, builds a causal knowledge base by estimating average treatment effects, and injects this knowledge into a dual-stream GNN-LSTM network through a novel causal attention mechanism that adjusts and enhances the forecast. Experiments on a real-world dataset demonstrate that Event-CausNet achieves robust performance, reducing prediction error (MAE) by up to 35.87% and significantly outperforming state-of-the-art baselines. Our framework bridges the gap between correlational models and causal reasoning: it is more accurate and transferable, it offers crucial interpretability, and it provides a more reliable foundation for real-world traffic management during critical disruptions.
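To make the average-treatment-effect step concrete, the following is a minimal sketch of ATE estimation via propensity score matching, not the authors' implementation. It assumes a binary treatment (for example, "an event of a given type is present"), a scalar outcome (for example, the change in traffic speed), a logistic-regression propensity model, and one-nearest-neighbour matching with replacement. All function names, parameters, and defaults are hypothetical.

```python
import numpy as np


def fit_propensity(X, t, steps=1000, lr=0.5):
    """Fit a logistic regression by gradient descent (an assumed propensity
    model) and return the estimated P(t = 1 | x) for every row of X."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - t) / len(t)  # gradient of the mean log-loss
    return 1.0 / (1.0 + np.exp(-Xb @ w))


def psm_ate(X, t, y):
    """Estimate the ATE with 1-nearest-neighbour matching on the propensity
    score, with replacement. Every unit is matched to the closest unit in the
    opposite group, whose outcome stands in for the missing counterfactual."""
    ps = fit_propensity(X, t)
    treated = np.flatnonzero(t == 1)
    control = np.flatnonzero(t == 0)
    # dist[i, j] = propensity gap between treated unit i and control unit j
    dist = np.abs(ps[treated, None] - ps[None, control])
    match_for_treated = control[dist.argmin(axis=1)]
    match_for_control = treated[dist.argmin(axis=0)]
    # Per-unit effects, always computed as (treated outcome - control outcome)
    effect_treated = y[treated] - y[match_for_treated]
    effect_control = y[match_for_control] - y[control]
    return float(np.concatenate([effect_treated, effect_control]).mean())
```

Here `X` would hold confounders that influence both event occurrence and traffic state (time of day or road type are plausible examples, but that choice is an assumption), `t` the event indicator, and `y` the observed outcome. Matching in both directions targets the ATE; matching only treated units to controls would instead estimate the effect on the treated.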
