Mind the truncation gap: challenges of learning on dynamic graphs with recurrent architectures
João Bravo, Jacopo Bono, Pedro Saleiro, Hugo Ferreira, Pedro Bizarro
TL;DR
The paper studies learning on continuous-time dynamic graphs (CTDGs) with graph recurrent neural networks (GRNNs) and identifies a truncation gap: a loss in performance caused by the short truncation of backpropagation-through-time (BPTT) that batch-based training imposes. It introduces a synthetic edge-regression task in which each node keeps a memory buffer of length $M$ and the target $y_k$ depends on the last elements of the buffers of the edge's two endpoints, so that long-horizon dependencies can be quantified. Experiments on both the synthetic data and real-world dynamic-graph benchmarks (Reddit, Wikipedia, MOOC) show a consistent performance gap between full BPTT (F-BPTT) and truncated BPTT (T-BPTT), in favor of F-BPTT. The authors discuss future directions beyond backpropagation, including unbiased online learning approximations, and argue that more research is needed to unlock GRNNs' capacity for long-range temporal reasoning.
Abstract
Systems characterized by evolving interactions, prevalent in social, financial, and biological domains, are effectively modeled as continuous-time dynamic graphs (CTDGs). To manage the scale and complexity of these graph datasets, machine learning (ML) approaches have become essential. However, CTDGs pose challenges for ML because traditional static graph methods do not naturally account for event timings. Newer approaches, such as graph recurrent neural networks (GRNNs), are inherently time-aware and offer advantages over static methods for CTDGs. However, GRNNs face another issue: the short truncation of backpropagation-through-time (BPTT), whose impact has not been properly examined until now. In this work, we demonstrate that this truncation can limit the learning of dependencies beyond a single hop, resulting in reduced performance. Through experiments on a novel synthetic task and real-world datasets, we reveal a performance gap between full backpropagation-through-time (F-BPTT) and the truncated backpropagation-through-time (T-BPTT) commonly used to train GRNN models. We term this gap the "truncation gap" and argue that understanding and addressing it is essential as the importance of CTDGs grows, discussing potential future directions for research in this area.
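To make the effect of truncation concrete, the following minimal sketch compares the full and truncated gradients on a toy problem. It is not the paper's GRNN setup: it assumes a scalar linear recurrence $h_t = w h_{t-1} + x_t$ with $h_0 = 0$ and a loss $\frac{1}{2}(h_T - y)^2$ on the final state only, and the names `bptt_grad` and `truncate` are hypothetical. The point is only that cutting the backward pass after a few steps drops the gradient contributions of earlier steps.

```python
def bptt_grad(w, xs, target, truncate=None):
    """Gradient of 0.5 * (h_T - target)**2 with respect to w
    for the toy recurrence h_t = w * h_{t-1} + x_t, h_0 = 0.

    truncate=None backpropagates through every step (full BPTT).
    truncate=k backpropagates through the k most recent steps only.
    """
    # Forward pass, storing every hidden state for the backward pass.
    hs = [0.0]
    for x in xs:
        hs.append(w * hs[-1] + x)

    T = len(xs)
    steps = T if truncate is None else min(truncate, T)

    # Backward pass, from the last step towards the first.
    dh = hs[-1] - target          # dL/dh_T
    grad = 0.0
    for t in range(T, T - steps, -1):
        grad += dh * hs[t - 1]    # direct contribution of step t: dh_t/dw = h_{t-1}
        dh *= w                   # chain rule: dL/dh_{t-1} = dL/dh_t * w
    return grad


# A single input at the first step that must be remembered for four steps.
xs = [1.0, 0.0, 0.0, 0.0]
full = bptt_grad(0.5, xs, target=0.0)                    # 0.09375
truncated = bptt_grad(0.5, xs, target=0.0, truncate=1)   # 0.03125
```

In this toy case the one-step truncated gradient recovers only a third of the full gradient, because the contributions routed through earlier hidden states are discarded. How large the corresponding gap is for GRNNs on dynamic graphs is the empirical question the paper addresses.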
