Homophily modulates double descent generalization in graph convolution networks

Cheng Shi, Liming Pan, Hong Hu, Ivan Dokmanić

TL;DR

The work tackles generalization in transductive node classification with graph-structured data, where double descent can arise in GNNs. It develops a theoretical framework combining statistical mechanics and random matrix theory to analyze a simple GCN on the contextual stochastic block model (CSBM) and introduces a universality conjecture that replaces the binary SBM adjacency with Gaussian matrices. The analysis yields explicit generalization curves that depend on graph homophily/heterophily, graph and feature noise, and the training label ratio $\tau = M/N$, and demonstrates that self-loops and their sign can mitigate double descent, including improvements on heterophilic graphs. The results provide both a conceptual explanation for observed phenomena and practical guidance for designing GCNs for heterophilic data, with implications for real-world datasets and beyond.

Abstract

Graph neural networks (GNNs) excel in modeling relational data such as biological, social, and transportation networks, but the underpinnings of their success are not well understood. Traditional complexity measures from statistical learning theory fail to account for observed phenomena like the double descent or the impact of relational semantics on generalization error. Motivated by experimental observations of "transductive" double descent in key networks and datasets, we use analytical tools from statistical physics and random matrix theory to precisely characterize generalization in simple graph convolution networks on the contextual stochastic block model. Our results illuminate the nuances of learning on homophilic versus heterophilic data and predict double descent whose existence in GNNs has been questioned by recent work. We show how risk is shaped by the interplay between the graph noise, feature noise, and the number of training labels. Our findings apply beyond stylized models, capturing qualitative trends in real-world GNNs and datasets. As a case in point, we use our analytic insights to improve performance of state-of-the-art graph convolution networks on heterophilic datasets.
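
The setting described above can be illustrated with a small numerical sketch. It is not the authors' code. It assumes one common two-class CSBM parameterization (edge probabilities $(d \pm \lambda\sqrt{d})/N$, features equal to a rank-one class signal plus Gaussian noise), a one-step graph convolution $(A + cI)X$ with self-loop weight $c$, and ridge regression fitted on a fraction $M/N$ of labeled nodes. The function names `sample_csbm` and `gcn_ridge_accuracy` and all parameter names and default values are hypothetical.

```python
import numpy as np


def sample_csbm(n, f, lam, mu, d=10.0, rng=None):
    """Sample a two-class CSBM (assumed parameterization, for illustration).

    lam > 0 gives a homophilic graph, lam < 0 a heterophilic one;
    |lam| must not exceed sqrt(d) so that edge probabilities stay non-negative.
    """
    rng = np.random.default_rng(rng)
    y = rng.choice([-1.0, 1.0], size=n)              # node labels
    c_in = d + lam * np.sqrt(d)                      # within-class degree scale
    c_out = d - lam * np.sqrt(d)                     # across-class degree scale
    same = y[:, None] == y[None, :]
    probs = np.where(same, c_in, c_out) / n
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # sample each pair once
    A = (upper | upper.T).astype(float)              # symmetric, no self-loops
    u = rng.standard_normal(f) / np.sqrt(f)          # hidden feature direction
    X = np.sqrt(mu / n) * np.outer(y, u) + rng.standard_normal((n, f)) / np.sqrt(f)
    return A, X, y


def gcn_ridge_accuracy(A, X, y, tau, self_loop=0.0, ridge=1e-2, rng=None):
    """Test accuracy of a one-step linear graph convolution with ridge regression.

    tau is the training label ratio M/N; self_loop is the weight c in (A + c I).
    """
    rng = np.random.default_rng(rng)
    n, f = X.shape
    m = int(round(tau * n))
    train = np.zeros(n, dtype=bool)
    train[rng.permutation(n)[:m]] = True             # M labeled nodes
    H = (A + self_loop * np.eye(n)) @ X              # convolved features, all nodes
    Ht, yt = H[train], y[train]
    w = np.linalg.solve(Ht.T @ Ht + ridge * np.eye(f), Ht.T @ yt)
    pred = np.sign(H[~train] @ w)                    # predict the unlabeled nodes
    return float(np.mean(pred == y[~train]))


if __name__ == "__main__":
    # Heterophilic graph (lam < 0); compare self-loop weights of different sign.
    A, X, y = sample_csbm(n=400, f=100, lam=-2.0, mu=1.0, rng=0)
    for c in (-1.0, 0.0, 1.0):
        acc = gcn_ridge_accuracy(A, X, y, tau=0.5, self_loop=c, rng=1)
        print(f"self_loop={c:+.1f}  test accuracy={acc:.3f}")
```

Sweeping `tau` from near 0 to near 1 for fixed `lam`, `mu`, and `ridge` is one way to trace an empirical generalization curve in this toy setting; whether a double descent peak appears depends on the chosen noise levels and regularization.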
