GNN Applied to Ego-nets for Friend Suggestions

Evgeny Zamyatin

TL;DR

This work tackles scalable link prediction in massive, heterogeneous social graphs with sparse node features. It introduces the Generalized Ego-network Friendship Score framework, which reduces full-graph prediction to ego-net-level tasks and aggregates their results. It also proposes WalkGNN, a second-order GNN over node pairs. Its WalkConv layer uses per-edge $d \times d$ filters generated by $\mathrm{EdgeMLP}_k(\overline{e})$ and updates the pair-state matrices $W_k^{u,v}$ as $W_{k+1}^{u,v} = \dfrac{1}{d} \sum_{(t,v,\overline{e}) \in E} W_k^{u,t} \times \mathrm{EdgeMLP}_k(\overline{e})$, at a cost of $O(n^3 d^2)$ per layer. The framework is evaluated offline on the Ego-VK and Yeast datasets and online in VK's PYMK (People You May Know). It is more accurate than baselines such as Adamic-Adar, GIN/PPGN, and variants without edge attributes, and it delivers a measurable business impact (e.g., a 12% CTR lift). The Ego-VK dataset provides a realistic, feature-sparse benchmark for supervised graph-level link prediction on ego-nets and supports future research into more sophisticated local aggregation and feature integration. Overall, the approach delivers scalable, improved friend recommendations and offers a new benchmark for topology-driven GNN expressivity in dynamic, heterogeneous graphs.
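The WalkConv update above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes each pair state $W_k^{u,t}$ is a length-$d$ row vector (which would be consistent with the stated $O(n^3 d^2)$ cost), and it replaces $\mathrm{EdgeMLP}_k$ with a hypothetical `filter_fn` callback that maps an edge attribute to a $d \times d$ filter. The names `walk_conv`, `filter_fn`, and `scaled_identity` are illustrative only.

```python
def walk_conv(W, edges, filter_fn):
    """One step of W_next[u][v] = (1/d) * sum over edges (t, v, e) of W[u][t] @ filter_fn(e).

    W is an n x n grid of length-d state vectors (an assumption of this sketch);
    edges is a list of directed (t, v, edge_attr) triples;
    filter_fn stands in for EdgeMLP_k and returns a d x d list of lists.
    """
    n = len(W)
    d = len(W[0][0])
    W_next = [[[0.0] * d for _ in range(n)] for _ in range(n)]
    for t, v, attr in edges:
        F = filter_fn(attr)  # per-edge d x d filter
        for u in range(n):
            state = W[u][t]
            for j in range(d):
                # row vector times matrix, scaled by 1/d as in the update rule
                W_next[u][v][j] += sum(state[i] * F[i][j] for i in range(d)) / d
    return W_next


D = 2  # state dimension used in this toy example


def scaled_identity(_attr):
    """Toy filter d * I, so the 1/d factor cancels and states are simply summed."""
    return [[float(D) if i == j else 0.0 for j in range(D)] for i in range(D)]


# Path graph 0 - 1 - 2, stored as directed edges without attributes.
edges = [(0, 1, None), (1, 0, None), (1, 2, None), (2, 1, None)]
n = 3
# Initial state: a marker in the first channel on the diagonal pairs (u, u).
W0 = [[[1.0, 0.0] if u == v else [0.0, 0.0] for v in range(n)] for u in range(n)]
W1 = walk_conv(W0, edges, scaled_identity)
W2 = walk_conv(W1, edges, scaled_identity)
```

With this constant filter, the first channel of `W2[u][v]` equals the number of length-2 walks from `u` to `v`, which is one way to read the layer's name. A learned, attribute-dependent filter would weight each step of the walk differently.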

Abstract

A major problem of making friend suggestions in social networks is the large size of social graphs, which can have hundreds of millions of people and tens of billions of connections. Classic methods based on heuristics or factorizations are often used to address the difficulties of scaling more complex models. However, the unsupervised nature of these methods can lead to suboptimal results. In this work, we introduce the Generalized Ego-network Friendship Score framework, which makes it possible to use complex supervised models without sacrificing scalability. The main principle of the framework is to reduce the problem of link prediction on a full graph to a series of low-scale tasks on ego-nets with subsequent aggregation of their results. Here, the underlying model takes an ego-net as input and produces a pairwise relevance matrix for its nodes. In addition, we develop the WalkGNN model which is capable of working effectively in the social network domain, where these graph-level link prediction tasks are heterogeneous, dynamic and featureless. To measure the accuracy of this model, we introduce the Ego-VK dataset that serves as an exact representation of the real-world problem that we are addressing. Offline experiments on the dataset show that our model outperforms all baseline methods, and a live A/B test demonstrates the growth of business metrics as a result of utilizing our approach.
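The reduction described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ego-net here contains the ego's friends but not the ego itself, the per-ego-net results are aggregated by summation, and the learned model is replaced by a trivial stand-in. The names `ego_net`, `constant_model`, and `friendship_scores` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import combinations


def ego_net(adj, ego):
    """Return the ego's friends and the friendships among them (ego excluded: an assumption)."""
    nodes = sorted(adj[ego])
    edges = {(u, v) for u, v in combinations(nodes, 2) if v in adj[u]}
    return nodes, edges


def constant_model(nodes, edges):
    """Stand-in for the supervised model: relevance 1.0 for every pair that is not yet connected."""
    return {(u, v): 1.0 for u, v in combinations(nodes, 2) if (u, v) not in edges}


def friendship_scores(adj, model):
    """Run the model on every ego-net and sum the pairwise relevances (summation is an assumption)."""
    scores = defaultdict(float)
    for ego in adj:
        nodes, edges = ego_net(adj, ego)
        for pair, relevance in model(nodes, edges).items():
            scores[pair] += relevance
    return dict(scores)


# Toy undirected friendship graph as an adjacency map.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "e"},
    "d": {"a", "e"},
    "e": {"c", "d"},
}
scores = friendship_scores(adj, constant_model)
```

With the constant stand-in model, the aggregated score of a pair is its number of common friends, so this sketch reduces to the common-neighbors heuristic. Swapping `constant_model` for a trained model that returns a relevance matrix per ego-net keeps the same outer loop, and each ego-net can be processed independently.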

Paper Structure

This paper contains 10 sections, 1 equation, 2 figures, 3 tables, 3 algorithms.

Figures (2)

  • Figure 1: Ego-net example.
  • Figure 2: PYMK at VK.