Can LLMs Predict Academic Collaboration? Topology Heuristics vs. LLM-Based Link Prediction on Real Co-authorship Networks

Fan Huang, Munjung Kim

Abstract

Can large language models (LLMs) predict which researchers will collaborate? We study this question through link prediction on real-world co-authorship networks from OpenAlex (9.96M authors, 108.7M edges), evaluating whether LLMs can predict future scientific collaborations using only author profiles, without access to graph structure. Using Qwen2.5-72B-Instruct across three historical eras of AI research, we find that LLMs and topology heuristics capture distinct signals and are strongest in complementary settings. On new-edge prediction under natural class imbalance, the LLM achieves AUROC 0.714--0.789, outperforming Common Neighbors, Jaccard, and Preferential Attachment, with recall up to 92.9%; under balanced evaluation, the LLM outperforms all topology heuristics in every era (AUROC 0.601--0.658 vs. best-heuristic 0.525--0.538); on continued edges, the LLM (0.687) is competitive with Adamic-Adar (0.684). Critically, 78.6--82.7% of new collaborations occur between authors with no common neighbor -- a blind spot where all topology heuristics score zero but the LLM still achieves AUROC 0.652 by reasoning from author metadata alone. A temporal metadata ablation reveals that research concepts are the dominant signal (removing concepts drops AUROC by 0.047--0.084). Providing pre-computed graph features to the LLM degrades performance due to anchoring effects, confirming that LLMs and topology methods should operate as separate, complementary channels. A socio-cultural ablation finds that name-inferred ethnicity and institutional country do not predict collaboration beyond topology, reflecting the demographic homogeneity of AI research. A node2vec baseline achieves AUROC comparable to Adamic-Adar, establishing that LLMs access a fundamentally different information channel -- author metadata -- rather than encoding the same structural signal differently.
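The topology heuristics named above (Common Neighbors, Jaccard, Adamic-Adar, Preferential Attachment) can be sketched on a toy graph. This is a minimal illustration, not the paper's implementation; the authors and edges below are hypothetical. It also shows the blind spot the abstract describes: for a pair with no common neighbor, the neighborhood-based scores are all zero.

```python
import math

# Hypothetical co-authorship graph as adjacency sets.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B", "E"},
    "D": {"A"},
    "E": {"C"},
}

def common_neighbors(u, v):
    return len(adj[u] & adj[v])

def jaccard(u, v):
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(u, v):
    # Sum 1/log(degree) over shared neighbors; rarer neighbors weigh more.
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v])

def preferential_attachment(u, v):
    return len(adj[u]) * len(adj[v])

# A and B share neighbor C, so all neighborhood scores are positive:
print(common_neighbors("A", "B"))  # 1
print(jaccard("A", "B"))           # 0.25
print(adamic_adar("A", "B"))       # 1/log(3) ~ 0.91

# D and E have no common neighbor: CN, Jaccard, and AA are all zero,
# leaving only degree-based signals like preferential attachment.
print(common_neighbors("D", "E"))  # 0
print(adamic_adar("D", "E"))       # 0.0
```

For the 78.6--82.7% of new collaborations in this regime, such scores carry no ranking information, which is where metadata-based prediction can still operate.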


Paper Structure

This paper contains 43 sections, 20 figures, and 15 tables.

Figures (20)

  • Figure 1: Ground truth construction pipeline. Three stages progressively narrow the OpenAlex corpus (249.1M works) to AI co-authorship edges, split them across three historical eras with train-eval temporal windows, detect communities via Louvain, and select the top-1 (largest) community per era for evaluation.
  • Figure 2: Empirical basis for the three-era temporal split. (a) Network scale: edges (bars) and authors (line). (b) Edge growth rate: the 2.61× spike marks Boundary 1; deceleration to 0.99× marks Boundary 2. (c) Average degree peaks then declines. (d) Edges-per-author peaks then drops.
  • Figure 3: Two prediction methods. (a) Topology heuristics: five structure-based scoring functions applied to candidate pairs. (b) LLM confirmation: author profiles provided to Qwen2.5-72B for binary link prediction.
  • Figure 4: Experimental sample networks for each era (5,000 stratified candidate pairs). Nodes colored by sub-community; red edges indicate true future collaborations (positives). Only 47 (Era 1), 14 (Era 2), and 7 (Era 3) of 5,000 pairs are positives, illustrating the extreme class imbalance.
  • Figure 5: 68 positive pairs plotted by AA score vs. LLM probability, colored by agreement category. The LLM catches 40% of positives that AA misses (red squares, lower-left quadrant), while only 1 pair is caught by AA alone. Dashed/dotted lines show Era 1 AA threshold (0.48) and LLM threshold (0.5).
  • ...and 15 more figures
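Figure 4's extreme class imbalance (as few as 7 positives among 5,000 candidate pairs) is why the evaluation leans on AUROC, which measures ranking quality independently of the class ratio. A minimal rank-based AUROC, equivalent to the normalized Mann-Whitney U statistic, can be sketched as follows; the scores and labels here are hypothetical, not results from the paper.

```python
def auroc(scores, labels):
    # AUROC = probability that a random positive pair is scored above
    # a random negative pair, counting ties as half a win.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical candidate pairs: 3 positives among 8.
labels = [1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.7, 0.8, 0.1, 0.6, 0.3]
print(auroc(scores, labels))  # 13 of 15 positive-negative pairs ranked correctly
```

Because AUROC compares only relative rankings within each class, a 47/5,000 split and a balanced split are scored on the same footing, which is what makes the imbalanced and balanced evaluations in the abstract comparable.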