Language Model-Enhanced Message Passing for Heterophilic Graph Learning

Wenjun Wang; Dawei Cheng

TL;DR

LEMP4HG tackles heterophilic graph learning by integrating LM-generated connection analyses with SLM-encoded node texts to produce semantically rich messages. It introduces a gating-based fusion of LM messages and node embeddings, and a Modulated Variation of Reliable Distance (MVRD) to drive selective, budget-bounded LM querying. An active-learning component selects the most informative edges to enhance, reducing cost and mitigating interference on homophilic regions. Across 16 real-world text-attributed graphs, LEMP4HG demonstrates robust gains on heterophilic cases and stable performance on homophilic ones, providing practical budget guidelines and insights into LM-assisted graph propagation.

Abstract

Traditional graph neural networks (GNNs), which rely on homophily-driven message passing, struggle with heterophilic graphs, where connected nodes exhibit dissimilar features and different labels. While existing methods address heterophily through graph structure refinement or adaptation of neighbor aggregation functions, they often overlook the semantic potential of node text, rely on suboptimal message representations for propagation, and compromise performance on homophilic graphs. To address these limitations, we propose a novel language model (LM)-enhanced message passing approach for heterophilic graph learning (LEMP4HG). Specifically, in the context of text-attributed graphs, we provide paired node texts to an LM to generate an analysis of their connection, which is encoded and then fused with the paired nodes' textual embeddings through a gating mechanism. The synthesized messages are semantically enriched and adaptively balanced with both nodes' information, which mitigates contradictory signals during neighbor aggregation in heterophilic regions. Furthermore, we introduce an active learning strategy guided by our heuristic MVRD (Modulated Variation of Reliable Distance), which selectively enhances the node pairs that suffer most from message passing, reducing the cost of analysis generation and the side effects on homophilic regions. Extensive experiments validate that our approach excels on heterophilic graphs and performs robustly on homophilic ones, with a graph convolutional network (GCN) backbone and a practical budget.
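
The two mechanisms described in the abstract, gated fusion of an encoded connection analysis with node embeddings and budget-bounded selection of node pairs to enhance, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the gate's functional form or the definition of MVRD, so the sigmoid gate over concatenated embeddings, the parameters `W` and `b`, the function names `gated_message` and `select_edges`, and the use of a precomputed per-edge score as a stand-in for MVRD are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_message(h_src, h_dst, h_analysis, W, b):
    """Fuse an encoded connection analysis with node embeddings.

    Assumed form (not taken from the paper):
        gate    = sigmoid(W @ [h_src; h_dst; h_analysis] + b)
        message = gate * h_analysis + (1 - gate) * h_src

    h_src, h_dst, h_analysis: 1-D arrays of length d.
    W: array of shape (d, 3 * d); b: array of length d.
    """
    z = np.concatenate([h_src, h_dst, h_analysis])
    gate = sigmoid(W @ z + b)
    # Per-dimension convex combination of the analysis and source embeddings.
    return gate * h_analysis + (1.0 - gate) * h_src


def select_edges(scores, budget):
    """Return indices of the `budget` highest-scoring edges.

    `scores` is a hypothetical per-edge heuristic standing in for MVRD,
    whose actual definition is not given in the abstract.
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    return order[:budget].tolist()
```

In this sketch, only the edges returned by `select_edges` would be sent to the LM for analysis, and `gated_message` would replace the plain neighbor embedding as the message on those edges; all other edges would keep standard message passing.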
