The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges

Sitao Luan, Chenqing Hua, Qincheng Lu, Liheng Ma, Lirong Wu, Xinyu Wang, Minkai Xu, Xiao-Wen Chang, Doina Precup, Rex Ying, Stan Z. Li, Jian Tang, Guy Wolf, Stefanie Jegelka

TL;DR

The Heterophilic Graph Learning Handbook confronts the conventional assumption that homophily underpins GNN success by surveying heterogeneous and homogeneous graphs under heterophily, introducing benchmarks, metrics, theoretical insights, and broad applications. It classifies datasets into benign, malignant, and ambiguous heterophily, provides a comprehensive taxonomy of models (from high-pass/low-pass filtering to graph transformers and HetGNNs), and details both supervised and unsupervised learning approaches tailored to heterophilic structures. The handbook also synthesizes theoretical findings on mid-homophily pitfalls, distribution shifts, and relaxation of traditional smoothing assumptions, linking these to practical performance and generalization. Finally, it maps a wide landscape of heterophily-related applications and outlines challenges and future directions, including temporal graphs, hypergraphs, fairness, and the emergence of graph foundation models, urging rigorous benchmarking and principled graph-aware design for real-world tasks.

Abstract

The homophily principle, i.e., that nodes with the same labels or similar attributes are more likely to be connected, has been commonly believed to be the main reason for the superiority of Graph Neural Networks (GNNs) over traditional Neural Networks (NNs) on graph-structured data, especially on node-level tasks. However, recent work has identified a non-trivial set of datasets on which GNNs do not perform satisfactorily compared with NNs. Heterophily, i.e., low homophily, has been considered the main cause of this empirical observation. Researchers have begun to revisit and re-evaluate most existing graph models, including graph transformers and their variants, in the heterophily scenario across various kinds of graphs, e.g., heterogeneous graphs, temporal graphs and hypergraphs. Moreover, numerous graph-related applications are found to be closely related to the heterophily problem. In the past few years, considerable effort has been devoted to studying and addressing the heterophily issue. In this survey, we provide a comprehensive review of the latest progress on heterophilic graph learning, including an extensive summary of benchmark datasets and an evaluation of homophily metrics on synthetic graphs, a meticulous classification of the most recent supervised and unsupervised learning methods, a thorough digest of the theoretical analysis of homophily/heterophily, and a broad exploration of heterophily-related applications. Notably, through detailed experiments, we are the first to categorize benchmark heterophilic datasets into three sub-categories: malignant, benign and ambiguous heterophily. Malignant and ambiguous datasets are identified as the truly challenging datasets for testing the effectiveness of new models on the heterophily challenge. Finally, we propose several challenges and future directions for heterophilic graph representation learning.
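To make the notion of "low homophily" concrete, the sketch below computes the edge homophily ratio, i.e., the fraction of edges whose endpoints share a label. This is one commonly used homophily metric and is not necessarily the one the handbook favors; the function name `edge_homophily` and the toy graphs are assumptions made purely for illustration.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints share a label.

    edges  -- list of (u, v) node-index pairs (undirected, each edge listed once)
    labels -- list where labels[i] is the class of node i
    """
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy example (hypothetical data): four nodes, two classes.
labels = [0, 0, 1, 1]

# Mostly intra-class edges: a homophilic graph.
homophilic_edges = [(0, 1), (2, 3), (1, 2)]

# Mostly inter-class edges: a heterophilic graph.
heterophilic_edges = [(0, 2), (0, 3), (1, 2), (0, 1)]

h_high = edge_homophily(homophilic_edges, labels)    # 2 of 3 edges match
h_low = edge_homophily(heterophilic_edges, labels)   # 1 of 4 edges match
print(h_high, h_low)
```

A value near 1 indicates a homophilic graph and a value near 0 a heterophilic one. As the abstract suggests, a low value alone does not imply that a dataset is hard for GNNs, which is what motivates the benign/malignant/ambiguous distinction.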

Paper Structure

This paper contains 189 sections, 16 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Example of feature aggregation in homophilic and heterophilic graphs. There are $3$ classes of nodes, and each color indicates different node features.
  • Figure 2: Comparison of metrics on synthetic graphs with different generation methods. Panels (a) and (b) are from luan2024addressing.
  • Figure 3: An overview of graph models for homogeneous graphs with heterophily.
  • Figure 4: An overview of unsupervised learning of graph models on heterophilic graphs. Only a small portion of the unsupervised graph models for heterophily is included in this diagram; more models and details can be found in each section.
  • Figure 5: An overview of theoretical understanding of graph homophily and heterophily.
  • ...and 1 more figure