Unifying Graph Convolutional Neural Networks and Label Propagation

Hongwei Wang, Jure Leskovec

TL;DR

This work investigates the theoretical relationship between Label Propagation (LPA) and Graph Convolutional Networks (GCN) from two perspectives, smoothing and influence, and introduces a unified end-to-end model (GCN-LPA) that learns edge weights guided by label information. By proving that feature smoothing implies label smoothing, and by linking Jacobian-based feature influence to label influence, the authors justify training edge weights to strengthen intra-class connectivity and improve label separation. The framework can be implemented as a two-stage procedure (optimize the edge weights $A^*$ via LPA, then run GCN) or trained jointly with LPA as a regularizer; either way it learns attention-like, task-oriented edge weights. Empirically, GCN-LPA achieves higher node classification accuracy than GCN-based baselines on five real-world graphs and is robust to noisy edges, while keeping training time reasonable.
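The two training variants mentioned above can be written compactly. The following is a sketch under assumptions, not the paper's exact formulation: $L_{\text{gcn}}$ and $L_{\text{lpa}}$ stand for the classification losses of the GCN and LPA predictions on the labeled nodes (their precise form is not given in this summary), $W$ denotes the GCN weights, and $\lambda$ is the trade-off weight whose sensitivity is reported in Figure 4.

```latex
% Two-stage variant: fit the edge weights with LPA first, then train the GCN on them.
A^* = \arg\min_{A} \; L_{\text{lpa}}(A), \qquad
W^* = \arg\min_{W} \; L_{\text{gcn}}(W, A^*)

% Joint variant: LPA acts as a regularizer on the shared edge weights A.
W^*, A^* = \arg\min_{W, A} \; L_{\text{gcn}}(W, A) + \lambda \, L_{\text{lpa}}(A)
```

In the joint form, setting $\lambda = 0$ leaves the edge weights driven by the GCN loss alone, while a larger $\lambda$ gives the label-propagation term more say over which edges are strengthened.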

Abstract

Label Propagation (LPA) and Graph Convolutional Neural Networks (GCN) are both message passing algorithms on graphs. Both solve the task of node classification, but LPA propagates node label information across the edges of the graph, while GCN propagates and transforms node feature information. Although the two are conceptually similar, the theoretical relationship between LPA and GCN has not yet been investigated. Here we study the relationship between LPA and GCN in terms of two aspects: (1) feature/label smoothing, where we analyze how the feature/label of one node is spread over its neighbors; and (2) feature/label influence, i.e., how much the initial feature/label of one node influences the final feature/label of another node. Based on our theoretical analysis, we propose an end-to-end model that unifies GCN and LPA for node classification. In our unified model, edge weights are learnable, and the LPA serves as regularization to assist the GCN in learning proper edge weights that lead to improved classification performance. Our model can also be seen as learning attention weights based on node labels, which is more task-oriented than existing feature-based attention models. In a number of experiments on real-world graphs, our model shows superiority over state-of-the-art GCN-based methods in terms of node classification accuracy.
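To make the contrast between the two message passing schemes concrete, here is a minimal NumPy sketch. It assumes row-normalized propagation with $D^{-1}A$ on an adjacency matrix that already includes self-loops, a ReLU nonlinearity for the GCN layer, and resetting labeled nodes to their known labels after each LPA step. The function names and the toy graph are illustrative choices, not taken from the paper.

```python
import numpy as np

def row_normalize(A):
    """Return D^{-1} A, where D is the diagonal degree matrix of A."""
    deg = A.sum(axis=1, keepdims=True)
    return A / np.where(deg == 0, 1.0, deg)

def label_propagation(A, Y0, labeled_mask, num_iters=10):
    """LPA: repeatedly average neighbor labels, keeping known labels fixed."""
    P = row_normalize(A)
    Y = Y0.copy()
    for _ in range(num_iters):
        Y = P @ Y                          # propagate labels along edges
        Y[labeled_mask] = Y0[labeled_mask]  # reset labeled nodes
    return Y

def gcn_layer(A, X, W):
    """One GCN layer: propagate features, transform with W, apply ReLU."""
    return np.maximum(row_normalize(A) @ X @ W, 0.0)

# Toy path graph 0-1-2-3 with self-loops; nodes 0 and 3 are labeled.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
Y0 = np.zeros((4, 2))
Y0[0, 0] = 1.0   # node 0 belongs to class 0
Y0[3, 1] = 1.0   # node 3 belongs to class 1
labeled = np.array([True, False, False, True])

Y = label_propagation(A, Y0, labeled, num_iters=50)
pred = Y.argmax(axis=1)

# The same normalized adjacency drives feature propagation in a GCN layer.
H = gcn_layer(A, np.eye(4), np.ones((4, 2)))
```

On this toy graph each unlabeled node ends up assigned to the class of its nearer labeled endpoint. The point of the sketch is that both routines share the same propagation matrix, which is why learning the edge weights in `A` affects label propagation and feature propagation at once.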

Paper Structure

This paper contains 22 sections, 8 theorems, 37 equations, 7 figures, 4 tables.

Key Result

Theorem 1

(Relationship between feature smoothing and label smoothing) Suppose that the latent ground-truth mapping $\mathcal{M}: {\bf x} \rightarrow y$ from node features to node labels is differentiable and satisfies the $L$-Lipschitz constraint, i.e., $| \mathcal{M}({\bf x}_1) - \mathcal{M}({\bf x}_2) | \leq L \| {\bf x}_1 - {\bf x}_2 \|_2$ … then the edge weights $\{a_{ij}\}$ also approximately smooth $y_i$ over its immediate neighbors with …
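The statement above is truncated in this summary, so the following is only an illustration of the idea, not a check of the theorem's exact bound. It assumes a hypothetical linear mapping $\mathcal{M}({\bf x}) = {\bf w}^\top {\bf x}$, which is Lipschitz with constant $L = \|{\bf w}\|_2$ under the Euclidean norm, and edge weights normalized to sum to one. If a node's feature equals the weighted average of its neighbors' features plus an error $\boldsymbol{\epsilon}$, then by the Cauchy-Schwarz inequality its label deviates from the weighted average of the neighbors' labels by at most $L \|\boldsymbol{\epsilon}\|_2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear ground-truth mapping M(x) = w . x (an assumption for
# illustration); its Lipschitz constant w.r.t. the Euclidean norm is ||w||_2.
w = rng.normal(size=5)
L = np.linalg.norm(w)

# Features of four neighbors and positive edge weights a_ij, normalized to sum to 1.
X_nbrs = rng.normal(size=(4, 5))
a = rng.uniform(0.1, 1.0, size=4)
a_norm = a / a.sum()

# Feature smoothing with error eps: x_i = sum_j a_ij x_j + eps.
eps = 0.01 * rng.normal(size=5)
x_i = a_norm @ X_nbrs + eps

# Labels of the node and of its neighbors under the mapping.
y_i = w @ x_i
y_nbrs = X_nbrs @ w

# Label smoothing error versus the Lipschitz bound.
label_error = abs(y_i - a_norm @ y_nbrs)
bound = L * np.linalg.norm(eps)
print(f"label smoothing error = {label_error:.6f}, bound L*||eps|| = {bound:.6f}")
```

For a linear mapping the label error is exactly $|{\bf w}^\top \boldsymbol{\epsilon}|$, so the bound holds without any higher-order term. A nonlinear differentiable mapping would generally add such a term.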

Figures (7)

  • Figure 1: A graph with two classes of nodes, while white nodes are unlabeled (Figure 1a). To ease the separation of the two classes, our model will increase the connecting strength among nodes within the same class (i.e., within one dotted circle), thereby increasing their feature/label influence on each other. In this way, our model is able to identify potential intra-class edges (bold links in Figure 1b) and strengthen their weights.
  • Figure 2: Node embeddings of Zachary's karate club network trained on a node classification task (red vs. blue). Figure 2a visualizes the graph. Node coordinates in Figures 2b-2e are the embedding coordinates. Notice that GCN does not produce linearly separable embeddings (Figure 2b vs. Figure 2c), while GCN-LPA performs much better even in the presence of noisy edges (Figure 2d vs. Figure 2e). Additional visualizations are included in Appendix E.
  • Figure 3: Sensitivity to the number of LPA iterations on the Citeseer dataset.
  • Figure 4: Sensitivity to $\lambda$ on the Citeseer dataset.
  • Figure 5: Training time per epoch on random graphs.
  • ...and 2 more figures

Theorems & Definitions (10)

  • Theorem 1
  • Definition 1
  • Definition 2
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4