Integrating Temporal and Structural Context in Graph Transformers for Relational Deep Learning

Divyansha Lachi, Mahmoud Mohammadi, Joe Meyer, Vinam Arora, Tom Palczewski, Eva L. Dyer

TL;DR

The work addresses the challenge of integrating long-range temporal and relational context in heterogeneous graphs for multi-task prediction. It introduces a temporal subgraph sampler to capture temporally relevant relationships beyond immediate neighborhoods and a Relational Graph Perceiver (RGP) that uses a Perceiver-style cross-attention bottleneck to blend structural and temporal signals in a shared latent space, plus a flexible cross-attention decoder for tasks with disjoint label spaces. Key contributions include enabling scalable global context in relational graphs and unified multi-task learning within a single model, with empirical state-of-the-art results on RelBench, SALT, and CTU across binary, multi-class, and ranking tasks. The findings demonstrate both accuracy gains and compute efficiency, suggesting broad practical impact for relational deep learning in domains with rich temporal dynamics and diverse predictive objectives. This approach provides a foundation for scalable, multi-task relational models applicable to healthcare, finance, and e-commerce settings where temporal interactions drive decisions.

Abstract

In domains such as healthcare, finance, and e-commerce, the temporal dynamics of relational data emerge from complex interactions, such as those between patients and providers, or users and products across diverse categories. To be broadly useful, models operating on these data must integrate long-range spatial and temporal dependencies across diverse types of entities, while also supporting multiple predictive tasks. However, existing graph models for relational data primarily focus on spatial structure, treating temporal information merely as a filtering constraint to exclude future events rather than a modeling signal, and are typically designed for single-task prediction. To address these gaps, we introduce a temporal subgraph sampler that enhances global context by retrieving nodes beyond the immediate neighborhood to capture temporally relevant relationships. In addition, we propose the Relational Graph Perceiver (RGP), a graph transformer architecture for relational deep learning that leverages a cross-attention-based latent bottleneck to efficiently integrate information from both structural and temporal contexts. This latent bottleneck integrates signals from different node and edge types into a common latent space, enabling the model to build global context across the entire relational system. RGP also incorporates a flexible cross-attention decoder that supports joint learning across tasks with disjoint label spaces within a single model. Experiments on RelBench, SALT, and CTU show that RGP delivers state-of-the-art performance, offering a general and scalable solution for relational deep learning with support for diverse predictive tasks.
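
To make the idea of a cross-attention latent bottleneck with a cross-attention decoder more concrete, the following is a minimal NumPy sketch of the general Perceiver-style pattern. It is not the authors' implementation: the single-head attention, the random weights, the dimensions, and every name in it (`cross_attention`, `node_tokens`, `latents`, `task_queries`) are assumptions chosen for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    # Single-head cross-attention: each query row attends over all context rows.
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


rng = np.random.default_rng(0)
d = 16                                       # hypothetical embedding width
num_nodes, num_latents, num_tasks = 200, 8, 3

# Stand-ins for embedded node tokens from a sampled subgraph (assumed input).
node_tokens = rng.normal(size=(num_nodes, d))
# A small set of latent tokens; these would be learned parameters in practice.
latents = rng.normal(size=(num_latents, d))
# One query per task; an assumption about how a task-specific decoder might look.
task_queries = rng.normal(size=(num_tasks, d))


def make_weights():
    # Random projection matrices for queries, keys, and values.
    return [rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3)]


enc_w, dec_w = make_weights(), make_weights()

# Encoder: latents read from all node tokens (cost scales with num_latents * num_nodes).
latents = latents + cross_attention(latents, node_tokens, *enc_w)
# Decoder: task queries read from the compact latent summary.
task_outputs = cross_attention(task_queries, latents, *dec_w)

print(latents.shape, task_outputs.shape)  # (8, 16) (3, 16)
```

In this sketch the attention cost grows with the product of the number of latents and the number of input tokens rather than quadratically in the input size, which is the usual motivation for a latent bottleneck.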

Paper Structure

This paper contains 20 sections, 4 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 3: (A) Results from ablation study of RGP. We evaluate the impact of removing key components from the full RGP model. Performance is reported relative to the base model. A decrease in performance indicates that the removed component is important to overall model effectiveness. (B) Relative performance of multi-task vs. single-task training across datasets: the Y-axis shows the average multi-task performance normalized with respect to the average single-task performance across all tasks within each dataset.
  • Figure 4: Effect of number of latent tokens on model performance across four representative tasks. For each task, we normalize results with respect to the best-performing configuration to compute relative performance.
  • Figure 5: Relative performance of multi-task vs. single-task training across datasets: the Y-axis shows the average multi-task performance normalized with respect to the average single-task performance across all tasks within each dataset.