Incorporating Domain Differential Equations into Graph Convolutional Networks to Lower Generalization Discrepancy

Yue Sun; Chao Chen; Yuesheng Xu; Sihong Xie; Rick S. Blum; Parv Venkitasubramaniam

TL;DR

This work tackles domain generalization in graph-based time-series forecasting by embedding domain-specific ordinary differential equations into Graph Convolutional Networks. The authors formalize a domain-discrepancy framework and prove that a domain-ODE-informed hypothesis class focusing on immediate, local dynamics yields lower generalization error under distribution shifts, compared with domain-agnostic baselines. They instantiate this approach with two architectures, RDGCN and SIRGCN, to model traffic speed and influenza-like illness spread, respectively, demonstrating robustness to mismatched training/testing conditions and requiring fewer training samples due to domain grounding. Empirically, RDGCN and SIRGCN outperform several strong baselines under mismatched data across multiple datasets and show favorable efficiency, with RDGCN achieving strong robustness while using fewer parameters. The results illuminate the value of integrating domain knowledge via ODEs into graph-based time-series models, with implications for broader domains beyond traffic and epidemiology.
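For orientation, the Susceptible-Infectious-Recovered (SIR) model behind the SIRGCN name is usually written in the textbook single-population form below. The symbols $\beta$ (transmission rate), $\gamma$ (recovery rate), and $N = S + I + R$ (total population) are conventional notation and an assumption here; this summary does not state the exact networked variant the authors use.

$$
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I.
\end{aligned}
$$

Summing the three equations gives $\frac{d}{dt}(S + I + R) = 0$, so the total population is conserved, which is one reason such equations constrain a model more tightly than a domain-agnostic network.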

Abstract

Ensuring both accuracy and robustness in time series prediction is critical to many applications, ranging from urban planning to pandemic management. With sufficient training data in which all spatiotemporal patterns are well represented, existing deep-learning models can make reasonably accurate predictions. However, existing methods fail when the training data are drawn from circumstances (e.g., traffic patterns on regular days) that differ from those of the test data (e.g., traffic patterns after a natural disaster). Such challenges are usually classified under domain generalization. In this work, we show that one way to address this challenge in the context of spatiotemporal prediction is to incorporate domain differential equations into Graph Convolutional Networks (GCNs). We theoretically derive conditions under which GCNs incorporating such domain differential equations are more robust to mismatched training and testing data than baseline domain-agnostic models. To support our theory, we propose two domain-differential-equation-informed networks: the Reaction-Diffusion Graph Convolutional Network (RDGCN), which incorporates differential equations for traffic speed evolution, and the Susceptible-Infectious-Recovered Graph Convolutional Network (SIRGCN), which incorporates a disease propagation model. Both RDGCN and SIRGCN are based on reliable and interpretable domain differential equations that allow the models to generalize to unseen patterns. We experimentally show that RDGCN and SIRGCN are more robust to mismatched testing data than state-of-the-art deep learning methods.
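To make the idea of a domain ODE on a graph more concrete, the following is a minimal sketch of one forward-Euler step of a reaction-diffusion update on a directed road graph. It follows the Figure 5 caption only loosely: diffusion acts along the direction of a road segment and reaction acts against it. The function name `reaction_diffusion_step`, the `tanh` reaction term, the weight matrices `rho` and `sigma`, and the toy graph are all assumptions made for illustration, not the authors' RDGCN formulation.

```python
import numpy as np


def reaction_diffusion_step(x, A, rho, sigma, dt=1.0):
    """One forward-Euler step of an illustrative graph reaction-diffusion ODE.

    x     : (n,) array of speeds at n sensors
    A     : (n, n) directed adjacency, A[i, j] = 1 for a road segment i -> j
    rho   : (n, n) hypothetical diffusion weights, applied on edges of A
    sigma : (n, n) hypothetical reaction weights, applied on reversed edges
    """
    # diff[i, j] = x[j] - x[i], built with NumPy broadcasting
    diff = x[None, :] - x[:, None]
    # Diffusion: linear in neighbor differences along the edge direction
    diffusion = np.sum(A * rho * diff, axis=1)
    # Reaction: bounded nonlinear term over edges taken in reverse (assumed form)
    reaction = np.tanh(np.sum(A.T * sigma * diff, axis=1))
    return x + dt * (diffusion + reaction)


# Toy chain of three sensors: 0 -> 1 -> 2 (hypothetical data)
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
rho = np.full((3, 3), 0.1)     # in a trained model these would be learned
sigma = np.full((3, 3), 0.05)  # in a trained model these would be learned

x = np.array([60.0, 30.0, 60.0])  # slow traffic at the middle sensor
x_next = reaction_diffusion_step(x, A, rho, sigma)

x_uniform = np.array([50.0, 50.0, 50.0])
x_uniform_next = reaction_diffusion_step(x_uniform, A, rho, sigma)
```

Each sensor's update depends only on its immediate neighbors, which is the "local dynamics" property the summary credits for robustness. When all speeds are equal, every neighbor difference is zero and the state does not change.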

Paper Structure (23 sections, 4 theorems, 32 equations, 10 figures, 6 tables)

This paper contains 23 sections, 4 theorems, 32 equations, 10 figures, 6 tables.

Introduction
Related Work
Problem Definition
Methodology
Proof of Robustness to Domain Generalization
Application of Domain-ODE informed GCNs
Evaluation
Experiment Settings
Results and analysis
Conclusion
Acknowledgements
Proofs
Proof of Lemma 2
Proof of Theorem 3
Discrepancy using MSE
...and 8 more sections

Key Result

Lemma 1

$h_1^\ast(X_{t-T:t}) = F(O(X(t), \mathcal{A});\Theta_1)+G_s(O(X(t), \mathbb{I} - \mathcal{A}), X_{t-T:t-1};\Theta_2)$.

Figures (10)

Figure 1: (a) Two collections of patterns, pattern A (present in a known dataset) and pattern B (difficult to collect and only available at test time), have no overlap between the training and test datasets; (b) Without incorporating a domain ODE, testing the model with such mismatched patterns may result in poorer accuracy; (c) With an architecture incorporating a domain ODE, the model is able to achieve good accuracy given unseen patterns.
Figure 2: (a) The results of RDGCN are very close regardless of the period of the training set. (b) Even though all the models are trained using all available weekdays, the results of RDGCN still vary less across periods than those of the baseline models. The numerical results, the plots for the other three time windows, and the corresponding RMSE results are in the Ablation Study in the appendix.
Figure 3: (a) The pdf of the random variable $G$ is symmetric about 0 for all the time periods. Figures in the first row show the mixed distribution of all sensors. Figures in the following three rows show the distributions of three randomly selected sensors in each dataset. (b) The pdf of the random variable $G$ is symmetric about 0 for all seasons. We randomly select 3 vertices in each dataset.
Figure 4: When an STGCN is tested on data from a matching distribution, the most important sensors (orange markers) for the traffic speed prediction at the target sensor (blue marker) are near the target sensor, whereas under mismatched data the most important sensors (red markers) are located far away.
Figure 5: (a) Diffusion occurs in the direction of a road segment; (b) reaction occurs opposite to the direction of a road segment.
...and 5 more figures

Theorems & Definitions (8)

Lemma 1
proof
Lemma 2
Theorem 3
proof
proof
Corollary 1
proof
