Diffusion Signals Reveal Hidden Connections: A Physics-Inspired Framework for Link Prediction via Personalized PageRank Signals

Huilin Wang Wenjun Zhang Weibing Deng

TL;DR

This work addresses the challenge of accurate, scalable link prediction in complex networks by introducing a physics-inspired framework that blends static topology with dynamical information flow. It defines D-PPR, which uses Personalized PageRank signals as multi-scale structural fingerprints and measures their diffusion on the graph via the Laplacian to produce a diffusion-based similarity score. The key contributions are the formalization of diffusion distance on PPR signals, the reciprocal scoring scheme, and extensive validation on Barabási-Albert, LFR, and seven real networks, where D-PPR often outperforms traditional local and global baselines, especially in sparse or modular graphs. The approach deepens understanding of the topology-dynamics interplay in networks and offers a robust, dynamics-aware tool for link prediction with potential extensions to directed, weighted, and temporal networks and links to graph representation learning.

Abstract

Link prediction in complex networks, that is, identifying missing or future connections, remains a cornerstone problem for understanding network evolution and function, yet existing methods struggle to balance computational efficiency with theoretical rigor across heterogeneous topologies. This work introduces a physically principled framework, Diffusion Distance with Personalized PageRank (D-PPR), which unifies static topology with dynamic information flow by modeling nodes as signal sources propagating through the network via Personalized PageRank (PPR) vectors. The method quantifies node-pair similarity through the graph Laplacian-governed diffusion distance between their topology-aware signal distributions, thereby bridging microscopic interactions with macroscopic network dynamics. Systematic benchmarking on synthetic (Barabási-Albert, LFR) and seven large-scale real-world networks spanning technology, biology, and social domains demonstrates that D-PPR achieves highly competitive performance and compares favorably with representative local and global heuristics, particularly in sparse and modular networks. These findings establish a rigorous foundation for physics-inspired link prediction by revealing that incorporating dynamical processes into structural similarity metrics enables deeper insights into network connectivity patterns, offering both methodological advances and new theoretical perspectives on the interplay between topology and dynamics.
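
The pipeline described above (PPR signals, Laplacian-governed diffusion, reciprocal scoring) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the summary does not give the exact formulas, so the sketch assumes a heat-kernel diffusion $e^{-tL}$ with the combinatorial Laplacian $L = D - A$, a Euclidean distance between the diffused PPR vectors, and a score of $1/(d + \epsilon)$. The function names and the parameters `alpha`, `t`, and `eps` are hypothetical choices made for illustration.

```python
import numpy as np

def ppr_matrix(A, alpha=0.15):
    """Column s is the Personalized PageRank vector seeded at node s.

    Assumes an undirected graph with no isolated nodes; alpha is the
    restart probability (an illustrative default, not from the paper).
    """
    n = A.shape[0]
    P = A / A.sum(axis=1)[:, None]  # row-stochastic transition matrix
    # Closed form of p_s = alpha * e_s + (1 - alpha) * P^T p_s for every seed s.
    return alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * P.T)

def heat_kernel(A, t=1.0):
    """exp(-t L) for the combinatorial Laplacian L = D - A (assumed form)."""
    L = np.diag(A.sum(axis=1)) - A
    w, V = np.linalg.eigh(L)             # L is symmetric, so eigh applies
    return (V * np.exp(-t * w)) @ V.T    # V diag(exp(-t w)) V^T

def dppr_scores(A, alpha=0.15, t=1.0, eps=1e-12):
    """Reciprocal of the distance between diffused PPR signals (assumed form)."""
    X = heat_kernel(A, t) @ ppr_matrix(A, alpha)   # one diffused signal per column
    diff = X[:, :, None] - X[:, None, :]           # all pairwise signal differences
    dist = np.linalg.norm(diff, axis=0)            # dist[u, v] = ||X[:, u] - X[:, v]||
    S = 1.0 / (dist + eps)
    np.fill_diagonal(S, 0.0)                       # self-pairs are not link candidates
    return S

# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the bridge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0
S = dppr_scores(A)
```

On this toy graph, nodes in the same triangle receive a higher score than nodes in opposite triangles, which matches the intuition that similar diffusion patterns indicate likely links. The dense inverse and eigendecomposition are only practical for small graphs; a scalable variant would need approximate PPR and an approximate diffusion operator.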

Paper Structure

This paper contains 11 sections, 5 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Visualization of the D-PPR Method's Core Intuition. The panels (Series A & B) visualize signals diffusing from two distinct nodes over time. Our method quantifies the similarity of these diffusion patterns using a diffusion distance. This distance, applied to Personalized PageRank signals, is then used to compute a final link score.
  • Figure 2: Performance of link prediction methods as a function of network density on Barabási-Albert (BA) model networks. The plot illustrates how the AUPR score for each method changes as the average degree $\langle k \rangle$ of the network increases. Networks were synthetically generated using the BA model with $N=500$ nodes. The results indicate that the performance of global methods, particularly our proposed D-PPR, improves substantially with increasing density. This suggests that these methods effectively leverage the greater abundance of paths available in denser graphs.
  • Figure 3: Robustness of link prediction methods against varying community structure in LFR benchmark networks. This figure plots the AUPR of each method as the community mixing parameter $\mu$ increases, where a higher $\mu$ signifies weaker and less distinct community structures. Networks were generated using the LFR benchmark model with $N=250$ nodes. While the performance of all methods declines as community structure weakens, the global heuristics (D-PPR and Katz Index) exhibit greater resilience compared to local methods, highlighting D-PPR's ability to identify structural similarities even when strong modularity is absent.
  • Figure 4: Comparative performance of link prediction methods on Zachary's Karate Club network. The bar chart displays the AUPR score for the proposed D-PPR framework against three baseline methods. The evaluation was performed on this classic social network benchmark by randomly hiding 20% of the edges for testing. The results demonstrate that D-PPR significantly outperforms all other heuristics. This superior performance is attributed to its ability to effectively capture the network's well-defined community structure, which is the primary driver of its topology.
  • Figure 5: Performance evaluation on diverse large-scale real-world networks. This figure presents a comparative analysis of AUPR scores for D-PPR and baseline methods across seven real-world networks from technological, communication, biological, and transportation domains (see Table 1 for network details). The results confirm the robust and broadly applicable performance of the D-PPR framework. Notably, it achieves the highest AUPR on several complex networks, such as Email and WWW, demonstrating its effectiveness and generalizability to diverse real-world systems.
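
The captions above report AUPR after hiding a fraction of edges (20% in Figure 4). A rough sketch of such an evaluation loop follows, under stated assumptions: AUPR is estimated by average precision, candidates are all node pairs not observed in the training graph, and a common-neighbors scorer stands in as a placeholder baseline. The helper names (`split_edges`, `evaluate`, `common_neighbors`) and the ring-lattice test graph are hypothetical and are not taken from the paper.

```python
import numpy as np

def average_precision(labels, scores):
    """Average precision, one common estimator of the area under the PR curve."""
    order = np.argsort(-scores, kind="stable")   # rank candidates by descending score
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / hits.sum())

def split_edges(A, test_frac=0.2, seed=0):
    """Hide a random fraction of the undirected edges; return the training graph."""
    rng = np.random.default_rng(seed)
    iu, ju = np.triu_indices_from(A, k=1)
    edge_idx = np.flatnonzero(A[iu, ju] > 0)
    n_test = max(1, int(round(test_frac * len(edge_idx))))
    hidden = rng.choice(edge_idx, size=n_test, replace=False)
    A_train = A.copy()
    A_train[iu[hidden], ju[hidden]] = 0
    A_train[ju[hidden], iu[hidden]] = 0
    return A_train, (iu[hidden], ju[hidden])

def evaluate(A, score_fn, test_frac=0.2, seed=0):
    """Score unobserved pairs on the training graph and compare with hidden edges."""
    A_train, _ = split_edges(A, test_frac, seed)
    S = score_fn(A_train)
    iu, ju = np.triu_indices_from(A, k=1)
    candidates = A_train[iu, ju] == 0        # pairs with no edge in the training graph
    labels = A[iu, ju][candidates]           # 1.0 where a hidden true edge exists
    return average_precision(labels, S[iu, ju][candidates])

def common_neighbors(A):
    """Placeholder local baseline: number of shared neighbors for each pair."""
    return A @ A

# Illustrative graph: ring of 12 nodes, each linked to its 2 nearest neighbors per side.
n = 12
A = np.zeros((n, n))
for i in range(n):
    for d in (1, 2):
        A[i, (i + d) % n] = A[(i + d) % n, i] = 1.0
ap = evaluate(A, common_neighbors)
```

Any function that maps a training adjacency matrix to a pairwise score matrix can be passed as `score_fn`, so different heuristics can be compared on the same split. Averaging over several seeds would reduce the variance of a single random split.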