TimeGraphs: Graph-based Temporal Reasoning
Paridhi Maheshwari, Hongyu Ren, Yanan Wang, Rok Sosic, Jure Leskovec
TL;DR
TimeGraphs tackles temporal reasoning in dynamic multi-agent settings where information is unevenly distributed over time. It builds a hierarchical temporal knowledge graph from frame-wise scene graphs, using a self-supervised, VIPool-based event model (trained with mutual-information objectives) to extract multi-scale events and enable adaptive reasoning via Graph Cross Networks. A downstream Relational-GCN classifier performs event prediction or recognition on the fused multi-level graph, with an optional end-to-end training regime that balances hierarchy and classification losses. Across the Football, Resistance, and MOMA datasets, TimeGraphs achieves state-of-the-art results, with a performance increase of up to 12.2% on event prediction and recognition tasks. It also demonstrates robustness to data sparsity, zero-shot generalization, and adaptability to streaming data.
Abstract
Many real-world systems exhibit temporal, dynamic behaviors, which are captured as time series of complex agent interactions. To perform temporal reasoning, current methods primarily encode temporal dynamics through simple sequence-based models. However, these models generally fail to efficiently capture the full spectrum of rich dynamics in the input, since the dynamics are not uniformly distributed over time. In particular, relevant information can be hard to extract, and computing power is wasted on processing every individual timestep, even those that contain no significant changes or new information. Here we propose TimeGraphs, a novel approach that characterizes dynamic interactions as a hierarchical temporal graph, diverging from traditional sequential representations. Our approach models the interactions using a compact graph-based representation, enabling adaptive reasoning across diverse time scales. Adopting a self-supervised method, TimeGraphs constructs a multi-level event hierarchy from a temporal input, which is then used to efficiently reason about the unevenly distributed dynamics. This construction process is scalable and incremental, so it can accommodate streaming data. We evaluate TimeGraphs on multiple datasets with complex, dynamic agent interactions, including a football simulator, the Resistance game, and the MOMA human activity dataset. The results demonstrate both the robustness and the efficiency of TimeGraphs on a range of temporal reasoning tasks. Our approach obtains state-of-the-art performance and leads to a performance increase of up to 12.2% on event prediction and recognition tasks over current approaches. Our experiments further demonstrate a wide array of capabilities, including zero-shot generalization, robustness to data sparsity, and adaptability to streaming data flow.
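To make the idea of a multi-level event hierarchy over unevenly distributed dynamics concrete, the following is a minimal illustrative sketch in Python. It is not the TimeGraphs method: the learned, self-supervised event model is replaced here by a hand-picked Jaccard-similarity heuristic, and all names (`jaccard`, `segment_events`, `summarize`, `build_hierarchy`), the triple-based scene-graph encoding, and the thresholds are assumptions introduced only for illustration. The sketch merges consecutive frames whose scene graphs barely change into a single event, then repeats the process on the resulting events to obtain coarser levels.

```python
from typing import FrozenSet, List, Tuple

Triple = Tuple[str, str, str]      # (subject, relation, object); hypothetical encoding
SceneGraph = FrozenSet[Triple]     # one frame = a set of relation triples


def jaccard(a: SceneGraph, b: SceneGraph) -> float:
    """Overlap between two scene graphs; 1.0 means identical triple sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def segment_events(graphs: List[SceneGraph], threshold: float) -> List[Tuple[int, int]]:
    """Group consecutive graphs into inclusive (start, end) index spans.

    A new span starts whenever similarity to the previous graph drops below
    `threshold`. This is a fixed heuristic standing in for a learned criterion.
    """
    if not graphs:
        return []
    spans = []
    start = 0
    for i in range(1, len(graphs)):
        if jaccard(graphs[i - 1], graphs[i]) < threshold:
            spans.append((start, i - 1))
            start = i
    spans.append((start, len(graphs) - 1))
    return spans


def summarize(graphs: List[SceneGraph], spans: List[Tuple[int, int]]) -> List[SceneGraph]:
    """Represent each span by the union of the triples it contains."""
    return [frozenset().union(*graphs[s:e + 1]) for s, e in spans]


def build_hierarchy(frames: List[SceneGraph], thresholds: List[float]) -> List[List[SceneGraph]]:
    """Level 0 holds the raw frames; each further level holds coarser events."""
    levels = [list(frames)]
    for t in thresholds:
        spans = segment_events(levels[-1], t)
        levels.append(summarize(levels[-1], spans))
    return levels


# Toy input: five frames, of which only two introduce a change.
passing = ("a", "passes_to", "b")
near = ("b", "near", "goal")
shoot = ("b", "shoots", "goal")
frames = [
    frozenset({passing}),
    frozenset({passing}),
    frozenset({passing, near}),
    frozenset({shoot}),
    frozenset({shoot}),
]

levels = build_hierarchy(frames, thresholds=[0.9, 0.4])
print([len(level) for level in levels])  # [5, 3, 2]
```

In this toy input, five frames collapse to three events and then to two, so a downstream model would reason over fewer, more informative units than the raw timesteps. A learned pooling model, as described in the abstract, would replace both the similarity test and the union-based summary.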
