ContextGNN: Beyond Two-Tower Recommendation Systems

Yiwen Yuan, Zecheng Zhang, Xinwei He, Akihiro Nitta, Weihua Hu, Dong Wang, Manan Shah, Shenyang Huang, Blaž Stojanovič, Alan Krumholz, Jan Eric Lenssen, Jure Leskovec, Matthias Fey

TL;DR

ContextGNN presents a hybrid graph neural network architecture for temporal, heterogeneous recommendations that fuses pair-wise representations in a user’s local subgraph with a lightweight two-tower model for distant items. The model learns a user-specific fusion to adaptively weight local versus distant signals, enabling effective recommendations for both familiar and exploratory items. Empirical results on RelBench demonstrate significant gains over both purely pair-wise and purely two-tower baselines, with notable robustness to locality variations, and the approach provides scalable training via a sampled-softmax objective. The work advances relational deep learning by offering a unified, end-to-end framework that integrates local, context-rich patterns with global candidate coverage for real-world, multi-relational recommendation tasks.

Abstract

Recommendation systems predominantly utilize two-tower architectures, which evaluate user-item rankings through the inner product of their respective embeddings. However, one key limitation of two-tower models is that they learn a pair-agnostic representation of users and items. In contrast, pair-wise representations either scale poorly due to their quadratic complexity or are too restrictive on the candidate pairs to rank. To address these issues, we introduce Context-based Graph Neural Networks (ContextGNNs), a novel deep learning architecture for link prediction in recommendation systems. The method employs a pair-wise representation technique for familiar items situated within a user's local subgraph, while leveraging two-tower representations to facilitate the recommendation of exploratory items. A final network then predicts how to fuse both pair-wise and two-tower recommendations into a single ranking of items. We demonstrate that ContextGNN is able to adapt to different data characteristics and outperforms existing methods, both traditional and GNN-based, on a diverse set of practical recommendation tasks, improving performance by 20% on average.
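As a rough illustration of the two-tower scoring the abstract refers to, the sketch below ranks items for one user by the inner product of embeddings. It assumes NumPy and random placeholder embeddings. The function name `two_tower_rank` and all shapes are hypothetical and not taken from the paper. Each item embedding is computed independently of the user, which is what makes the representation pair-agnostic.

```python
import numpy as np


def two_tower_rank(user_emb, item_embs, k):
    """Rank items for a single user by inner product (illustrative only).

    user_emb:  (d,) user embedding from the "user tower".
    item_embs: (num_items, d) item embeddings from the "item tower".
    Returns the indices of the k highest-scoring items and all scores.
    """
    scores = item_embs @ user_emb              # one score per item
    top_k = np.argsort(-scores, kind="stable")[:k]
    return top_k, scores


# Placeholder embeddings; a real system would learn these.
rng = np.random.default_rng(0)
user_emb = rng.normal(size=8)
item_embs = rng.normal(size=(5, 8))

top_k, scores = two_tower_rank(user_emb, item_embs, k=3)
```

Because the item embeddings do not depend on the user, they can be precomputed once and reused for every user. This keeps ranking cheap, but the item side cannot adapt to a specific user-item pair.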

Paper Structure

This paper contains 23 sections, 2 equations, 1 figure, 5 tables.

Figures (1)

  • Figure 1: Overview of Context-based Graph Neural Networks. ContextGNN utilizes a bidirectional $\mathrm{GNN}_{\bm{\theta}}$ to learn user $\bm{h}^{(2)}_v$ and user-specific item representations $\bm{h}^{(2)}_w$ within a user's local subgraph. Its message passing scheme is enhanced by additionally propagating shallow item embeddings $\bm{w}_w$ and seed user $\textsc{Identicator}_{\bm{\theta}}$ representations. Afterwards, item scores are produced depending on whether an item lies within the user's subgraph. A user-specific fusion score is learned via an $\mathrm{MLP}_{\bm{\theta}}$ to produce the final ranking by offsetting the contributions of local rankings.
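
The scoring step in the caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: items inside the user's subgraph receive a pair-wise score, all other items keep their two-tower score, and a scalar `user_offset` stands in for the output of the fusion $\mathrm{MLP}_{\bm{\theta}}$. Applying the offset to the local scores is an assumption read off the caption, and the paper's exact formula may differ. All names and numbers are hypothetical.

```python
import numpy as np


def fuse_scores(two_tower_scores, local_item_ids, pairwise_scores, user_offset):
    """Merge local pair-wise scores into a global two-tower score vector.

    two_tower_scores: (num_items,) scores for every catalogue item.
    local_item_ids:   indices of items found in the user's local subgraph.
    pairwise_scores:  pair-wise scores for those local items.
    user_offset:      scalar placeholder for the learned fusion score
                      (assumption: it shifts the local scores).
    """
    fused = np.array(two_tower_scores, dtype=float)  # copy, keep input intact
    fused[local_item_ids] = np.asarray(pairwise_scores, dtype=float) + user_offset
    return fused


two_tower_scores = np.array([0.1, 0.4, -0.2, 0.3, 0.0])
local_item_ids = np.array([1, 4])        # items seen in the local subgraph
pairwise_scores = np.array([0.9, 0.2])

# A positive offset favours familiar (local) items.
fused = fuse_scores(two_tower_scores, local_item_ids, pairwise_scores, 0.5)
ranking = np.argsort(-fused)

# A negative offset favours exploratory (distant) items.
fused_explore = fuse_scores(two_tower_scores, local_item_ids, pairwise_scores, -1.0)
ranking_explore = np.argsort(-fused_explore)
```

With the positive offset, the two local items move to the top of the ranking. With the negative offset, distant items overtake them. This is the adaptive behaviour that a user-specific fusion score is meant to provide.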