Unsupervised Neighborhood Propagation Kernel Layers for Semi-supervised Node Classification

Sonny Achten; Francesco Tonin; Panagiotis Patrinos; Johan A. K. Suykens

TL;DR

This paper introduces the Graph Convolutional Kernel Machine (GCKM), a deep architecture that stacks unsupervised kernel machine layers for one-hop neighborhood propagation and ends with a semi-supervised kernel machine readout. The model operates in the dual space via conjugate feature duality: each layer solves a Kernel PCA-based objective, and the readout is a Kernel Spectral Clustering-based layer that predicts the labels. Training is end-to-end, using a two-step initialization followed by fine-tuning. The approach yields competitive or superior performance in low-label regimes on both homophilous and heterophilous graphs, aided by an unsupervised validation metric for model selection. The work demonstrates the value of an unsupervised core for semi-supervised node classification and suggests extensions to inductive tasks and scalable sparse implementations.

Abstract

We present a deep Graph Convolutional Kernel Machine (GCKM) for semi-supervised node classification in graphs. The method is built from two main types of blocks: (i) unsupervised kernel machine layers that propagate the node features in a one-hop neighborhood, using implicit node feature mappings; and (ii) a semi-supervised classification kernel machine, specified through the lens of the Fenchel-Young inequality. We derive an effective initialization scheme and an efficient end-to-end training algorithm in the dual variables for the full architecture. The main idea underlying GCKM is that, because of the unsupervised core, the final model can achieve higher performance in semi-supervised node classification when few labels are available for training. Experimental results demonstrate the effectiveness of the proposed framework.
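As a rough illustration of block (i), the sketch below aggregates node features over a one-hop neighborhood, applies an implicit feature map through a kernel, and takes the leading kernel PCA eigenvectors as the dual variables passed to the next layer. It is a minimal sketch under stated assumptions, not the authors' implementation: the sum aggregator with self-loops, the RBF kernel, and the names `sum_aggregate`, `rbf_kernel`, and `kpca_layer` are all hypothetical choices made for illustration, and the semi-supervised readout and end-to-end fine-tuning are omitted.

```python
import numpy as np


def sum_aggregate(A, X):
    """One-hop aggregation (assumed): own features plus the sum of neighbors'."""
    return (A + np.eye(A.shape[0])) @ X


def rbf_kernel(X, sigma=1.0):
    """RBF kernel matrix between all rows of X (kernel choice is an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def kpca_layer(A, X, s, sigma=1.0):
    """Unsupervised layer sketch: aggregate, build a centered kernel, keep top-s eigenvectors."""
    n = A.shape[0]
    K = rbf_kernel(sum_aggregate(A, X), sigma)
    C = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = C @ K @ C
    Kc = (Kc + Kc.T) / 2.0                   # enforce exact symmetry before eigh
    evals, evecs = np.linalg.eigh(Kc)        # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:s]        # indices of the s largest eigenvalues
    return evecs[:, idx], evals[idx]         # dual variables H and their eigenvalues


# Toy path graph 0-1-2-3 with random 3-dimensional node features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 3))

H1, lam1 = kpca_layer(A, X, s=2)    # first layer acts on the node features
H2, lam2 = kpca_layer(A, H1, s=2)   # second layer acts on the first layer's dual variables
```

Stacking the layers this way mirrors only the layer-wise initialization idea: each layer is solved independently given the output of the previous one. In the paper, the full architecture is subsequently trained end-to-end in the dual variables.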

Paper Structure (41 sections, 12 theorems, 48 equations, 4 figures, 9 tables, 1 algorithm)

This paper contains 41 sections, 12 theorems, 48 equations, 4 figures, 9 tables, 1 algorithm.

Introduction
Contributions
Preliminaries and related work
Graph neural networks
Restricted kernel machines
Kernels in GNNs
GNN inspired shallow kernel learning
Method
The graph convolutional and semi-supervised kernel machine layers as building blocks
Graph convolutional kernel machine layer
Semi-supervised restricted kernel machine layer
Deep graph convolutional kernel machine
Training deep graph convolutional kernel machines
Experiments
Datasets and main setting
...and 26 more sections

Key Result

Lemma 1

The solution to the dual minimization problem eq:GCKM_dual_minimization satisfies the same first-order conditions for optimality w.r.t. $\bm{H}$ as eq:GCKM_energy when the hyperparameters $\boldsymbol{\Lambda}$ in eq:GCKM_energy are chosen to equal the symmetric part of the Lagrange multipliers $\bm{Z}$ of the equality constraints in eq:GCKM_dual_minimization; i.e., [...]

Figures (4)

Figure 1: A deep GCKM architecture for semi-supervised node classification, consisting of two GCKM layers (GCKM$\ell$1, GCKM$\ell$2) and a Semi-SupRKM layer. In each GCKM$\ell$, the node features are aggregated and then (implicitly) transformed to obtain the error variables. The dual variables are coupled with these error variables by means of conjugate feature duality and serve as input for the next layer. In the final Semi-SupRKM layer, the dual variables directly represent the class labels of the unsupervised nodes.
Figure 2: Train, validation and test accuracies, and cosine similarity score during training on CiteSeer dataset.
Figure 3: Train, validation and test accuracies, and cosine similarity score during training on Chameleon dataset.
Figure 4: Top: train, validation and test accuracies, and cosine similarity score; middle: training loss; and bottom: orthogonality loss, during training on CiteSeer dataset after random initialization.

Theorems & Definitions (22)

Lemma 1
Proposition 2
Remark 3
Definition 4: Graph Convolutional Kernel Machine layer
Lemma 5
Proposition 6
Lemma 7
Lemma 8
Remark 9
Lemma
...and 12 more
