RankMerging: A supervised learning-to-rank framework to predict links in large social networks

Lionel Tabourier, Daniel Faria Bernardes, Anne-Sophie Libert, Renaud Lambiotte

TL;DR

RankMerging addresses the challenge of predicting missing or future links in large, sparse social networks by proposing a supervised learning-to-rank framework that aggregates multiple unsupervised rankings. It uses a greedy, window-based optimization to maximize the number of true positives in the top-$\theta$ predictions, learning mixing coefficients from a training graph and applying them to a test graph with a scaling factor. Across four large datasets (PSP, DBLP, Pokec, Facebook) and multiple baselines, RankMerging consistently improves precision-recall performance over unsupervised aggregations and competitive supervised methods, while remaining computationally efficient with complexity $O(\alpha \cdot \theta)$ where $\alpha$ is the number of rankings and $\theta$ the number of predictions. The approach is versatile, capable of incorporating various features and scalable to very large networks, and offers practical benefits for applications like churn prediction, security, and biomedical discovery.
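The greedy, window-based merging described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `greedy_window_merge`, the tie-breaking rule (the earliest ranking wins), and the naive recounting of each window at every step are assumptions made for readability. In particular, recounting costs on the order of $\alpha \cdot g \cdot \theta$ for a window of size $g$, so an implementation matching the stated $O(\alpha \cdot \theta)$ complexity would have to update the window counts incrementally. The sketch also covers only the learning phase; applying the result to a test graph with the scaling factor is left out.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def greedy_window_merge(
    rankings: Sequence[Sequence[Hashable]],
    positives: Set[Hashable],
    g: int,
    theta: int,
) -> Tuple[List[Hashable], List[int]]:
    """Greedily merge several rankings into one list of at most theta items.

    At each step, every ranking is scored by the number of true positives
    among its next g items (its "window"). The best-scoring ranking
    contributes its next item to the merged list.

    Returns the merged list and, for each step, the index of the ranking
    that was chosen (an illustrative stand-in for the learned mixing).
    """
    cursors = [0] * len(rankings)   # next unread position in each ranking
    seen: Set[Hashable] = set()     # items already placed in the merged list
    merged: List[Hashable] = []
    picks: List[int] = []

    while len(merged) < theta:
        best, best_score = None, -1
        for i, ranking in enumerate(rankings):
            # Skip items that another ranking has already contributed.
            while cursors[i] < len(ranking) and ranking[cursors[i]] in seen:
                cursors[i] += 1
            if cursors[i] >= len(ranking):
                continue  # this ranking is exhausted
            window = ranking[cursors[i]:cursors[i] + g]
            score = sum(1 for item in window
                        if item in positives and item not in seen)
            if score > best_score:  # strict: ties go to the earliest ranking
                best, best_score = i, score
        if best is None:
            break  # every ranking is exhausted
        item = rankings[best][cursors[best]]
        merged.append(item)
        seen.add(item)
        cursors[best] += 1
        picks.append(best)

    return merged, picks


# Toy usage: two hypothetical rankings of node pairs, labelled by letters.
ranking_a = ["a", "b", "c", "d"]
ranking_b = ["x", "y", "a", "z"]
known_links = {"x", "y", "c"}
merged, picks = greedy_window_merge([ranking_a, ranking_b], known_links, g=2, theta=3)
```

In this toy run the second ranking is chosen while its window still contains known links, after which the tie-breaking rule falls back to the first ranking.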

Abstract

Uncovering unknown or missing links in social networks is a difficult task because of their sparsity and because links may represent different types of relationships, characterized by different structural patterns. In this paper, we define a simple yet efficient supervised learning-to-rank framework, called RankMerging, which aims at combining information provided by various unsupervised rankings. We illustrate our method on three different kinds of social networks and show that it substantially improves the performances of unsupervised metrics of ranking. We also compare it to other combination strategies based on standard methods. Finally, we explore various aspects of RankMerging, such as feature selection and parameter estimation and discuss its area of relevance: the prediction of an adjustable number of links on large networks.

Paper Structure

This paper contains 44 sections, 2 equations, 9 figures, 5 tables, 2 algorithms.

Figures (9)

  • Figure 1: Schematic representation of the scaling factor $f$, the ratio of the number of pairs ranked in the test set to the number of pairs ranked in the training set. In this example, the test set is assumed to rank twice as many items as the learning set, so that $f=2$.
  • Figure 2: RankMerging method: learning algorithm.
  • Figure 3: RankMerging method: testing algorithm.
  • Figure 4: Function $\frac{\chi_i}{g}$ corresponding to ranking $\textit{CN}_w$ for $g=100,1000,5000$ during a learning process on the PSP dataset.
  • Figure 5: Results obtained on the learning set for various structural classifiers. Left: F1-score as a function of the number of predictions. Right: precision versus recall curves.
  • ...and 4 more figures