UniGAP: A Universal and Adaptive Graph Upsampling Approach to Mitigate Over-Smoothing in Node Classification Tasks

Xiaotang Wang, Yun Zhu, Haizhou Shi, Yongchao Liu, Yongqi Zhang

TL;DR

UniGAP is a universal, adaptive graph upsampling framework that mitigates over-smoothing in node classification by inserting intermediate nodes on edges, guided by trajectory-derived features. It comprises trajectory computation, a multi-view condensation encoder, and a differentiable adaptive upsampler, all trained end-to-end with a smoothing regularizer; its modular design lets it plug into diverse GNNs. Theoretical analysis under linear mean aggregation suggests that UniGAP slows smoothing from a rate of $A^{2k}$ to about $A^{k-1}$. Empirically, it improves results consistently across homophilic and heterophilic datasets, especially on challenging heterophilic graphs, while scaling comparably to standard GNNs. The work also shows promise when combined with large language models on text-attributed graphs, pointing to cross-domain benefits and possible extensions to other graph tasks.
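
The core graph operation mentioned above, placing an intermediate node on an edge, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `upsample_edges`, the fixed boolean `insert_mask`, and the mean-of-endpoints feature initialization are all assumptions made for illustration. In UniGAP the insertion decision is predicted by the adaptive upsampler from condensed trajectory features through a differentiable sampling step, which this sketch does not model.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]


def upsample_edges(
    features: List[List[float]],
    edges: Sequence[Edge],
    insert_mask: Sequence[bool],
) -> Tuple[List[List[float]], List[Edge]]:
    """Insert one intermediate node on every edge whose mask entry is True.

    Hypothetical helper: the new node's feature is the mean of its two
    endpoints, which is only one possible initialization.
    """
    if len(edges) != len(insert_mask):
        raise ValueError("insert_mask needs exactly one entry per edge")

    new_features = [list(row) for row in features]  # copy, leave input intact
    new_edges: List[Edge] = []
    for (u, v), insert in zip(edges, insert_mask):
        if not insert:
            new_edges.append((u, v))  # edge is kept unchanged
            continue
        w = len(new_features)  # index assigned to the inserted node
        new_features.append(
            [(a + b) / 2.0 for a, b in zip(features[u], features[v])]
        )
        new_edges.extend([(u, w), (w, v)])  # path u - w - v replaces edge u - v
    return new_features, new_edges


# Toy path graph 0 - 1 - 2; insert a node on the first edge only.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
e = [(0, 1), (1, 2)]
x_up, e_up = upsample_edges(x, e, insert_mask=[True, False])
```

After the call, nodes 0 and 1 are two hops apart instead of one, so a single round of message passing no longer mixes their features directly. That is the intuition behind slowing down smoothing on selected edges.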

Abstract

In the graph domain, deep graph networks based on Message Passing Neural Networks (MPNNs) or Graph Transformers often cause over-smoothing of node features, limiting their expressive capacity. Many upsampling techniques involving node and edge manipulation have been proposed to mitigate this issue. However, these methods are often heuristic: they require extensive manual labor, yield suboptimal performance, and lack a universal integration strategy. In this study, we introduce UniGAP, a universal and adaptive graph upsampling framework to mitigate over-smoothing in node classification tasks. Specifically, we design an adaptive graph upsampler based on condensed trajectory features, serving as a plug-in component for existing GNNs to mitigate the over-smoothing problem and enhance performance. Moreover, UniGAP serves as a representation-based and fully differentiable framework to inspire further exploration of graph upsampling methods. Through extensive experiments, UniGAP demonstrates significant improvements over heuristic data augmentation methods across various datasets and metrics. We analyze how graph structure evolves with UniGAP, identifying key bottlenecks where over-smoothing occurs and providing insights into how UniGAP addresses this issue. Lastly, we show the potential of combining UniGAP with large language models (LLMs) to further improve downstream performance. Our code is available at: https://github.com/wangxiaotang0906/UniGAP

Paper Structure

This paper contains 33 sections, 4 theorems, 16 equations, 8 figures, 10 tables, 1 algorithm.

Key Result

Proposition 4.1

By jointly training the parameters of the downstream model $\theta$ and UniGAP $\theta_{u}$ at each epoch, UniGAP learns the optimal structural representation for downstream tasks by optimizing $\langle\hat{\theta}, \hat{\theta}_{u}\rangle$ as follows:

Figures (8)

  • Figure 1: Overview of the UniGAP framework. (1) Given an input graph, the Trajectory Precomputation module first computes layer-wise node trajectories that capture how node representations evolve with depth. (2) These trajectories are fed into the Multi-View Condensation (MVC) Encoder, which aggregates multi-hop information into a compact, per-node representation. (3) The Adaptive Upsampler then uses these condensed features to predict where to insert intermediate nodes on edges, and a differentiable sampling step produces an augmented graph with updated topology and node features. (4) Finally, the augmented graph is passed to a downstream GNN for tasks such as node classification, and the task loss is backpropagated to jointly update the MVC encoder, the upsampler, and the downstream model. After a warm-up epoch, the refined downstream GNN is reused to recompute trajectories, and the process is iterated until convergence.
  • Figure 2: Instantiations of the MVC Encoder. Left: Trajectory-MLP-Mixer, where the circle size denotes the weight of each hop's information. Right: Trajectory-Transformer, which treats each hop feature as an input token.
  • Figure 3: Accuracy and MAD values of various upsampling methods at different layers on the Cora dataset.
  • Figure 4: The proportion of nodes inserted by UniGAP on intra-class and inter-class edges in the optimal graph.
  • Figure 5: The performance of UniGAP with different trajectory precomputation strategies (pink), different MVC Encoders (blue), and different initializations for inserted nodes (green).
  • ...and 3 more figures
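
Figure 3 reports MAD values per layer. Assuming MAD here refers to the Mean Average Distance commonly used to quantify over-smoothing (average cosine distance between node representations), the following sketch shows how such a value could be tracked across layers. It is a simplified variant that averages over all distinct node pairs, and the helper names `mad` and `mean_aggregate` are hypothetical. The parameter-free mean aggregation with self-loops stands in for a trained GNN layer and is used only to make the collapse of representations visible.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cos(a, b); pairs involving a zero vector count as 0."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    cos = sum(x * y for x, y in zip(a, b)) / (na * nb)
    return 1.0 - max(-1.0, min(1.0, cos))  # clamp against rounding error


def mad(h: Sequence[Sequence[float]]) -> float:
    """Simplified MAD: mean cosine distance over all distinct node pairs."""
    n = len(h)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(cosine_distance(h[i], h[j]) for i, j in pairs) / len(pairs)


def mean_aggregate(
    h: Sequence[Sequence[float]], edges: Sequence[Tuple[int, int]]
) -> List[List[float]]:
    """One linear mean-aggregation step over neighbors plus a self-loop."""
    neighbors = [[i] for i in range(len(h))]  # self-loop for every node
    for u, v in edges:  # edges are treated as undirected
        neighbors[u].append(v)
        neighbors[v].append(u)
    dim = len(h[0])
    return [
        [sum(h[j][d] for j in nb) / len(nb) for d in range(dim)]
        for nb in neighbors
    ]


# Toy path graph 0 - 1 - 2 - 3 with 2-dimensional node features.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]]
edges = [(0, 1), (1, 2), (2, 3)]

mad_per_layer = [mad(h)]
for _ in range(50):
    h = mean_aggregate(h, edges)
    mad_per_layer.append(mad(h))
```

On this connected toy graph, repeated mean aggregation drives all node representations toward the same direction, so the last entry of `mad_per_layer` is close to zero. Running the same loop on an upsampled edge list would give a rough way to compare how quickly MAD decays with and without inserted nodes.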

Theorems & Definitions (5)

  • Proposition 4.1
  • Definition 1
  • Lemma 4.2
  • Lemma 4.3
  • Theorem 4.4: Slower Smoothing of UniGAP