Multi-View Subgraph Neural Networks: Self-Supervised Learning with Scarce Labeled Data

Zhenzhong Wang; Qingyuan Zeng; Wanyu Lin; Min Jiang; Kay Chen Tan

TL;DR

This paper tackles node classification under severe label scarcity by introducing Muse, a self-supervised framework that fuses two subgraph views: one from the input space, capturing local structure, and one from a latent Isomap-based space, capturing long-range dependencies. Subgraphs are identified via mutual information optimization and combined with node embeddings to produce augmented representations, which are supervised by a prototypical loss that encodes class structure. Theoretical generalization bounds support the approach, and experiments on five datasets show that Muse consistently outperforms baselines in low-label settings, with ablations highlighting the importance of both MI-based selection and dual-view fusion. The work offers a practical pathway to robust graph representations when labeling is expensive, with potential extensions to graph-level tasks and deeper information-theoretic analyses.
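To make the idea of a prototypical loss concrete, here is a minimal sketch in plain Python. It follows the standard prototypical-network formulation (class prototypes as mean embeddings, softmax over negative squared distances) and is not taken from the paper; the function names and the `temperature` parameter are assumptions made for illustration, and the exact loss used by Muse may differ.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Return the mean embedding of each class (hypothetical helper)."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def prototypical_loss(embeddings, labels, temperature=1.0):
    """Average cross-entropy of each labeled node against class prototypes.

    Logits are negative squared distances to each prototype, scaled by an
    assumed temperature. A low loss means embeddings sit near their own
    class prototype and far from the others.
    """
    prototypes = class_prototypes(embeddings, labels)
    total = 0.0
    for vec, label in zip(embeddings, labels):
        logits = {
            c: -squared_distance(vec, p) / temperature
            for c, p in prototypes.items()
        }
        # Numerically stable log-sum-exp over all classes.
        top = max(logits.values())
        log_norm = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        total += log_norm - logits[label]
    return total / len(embeddings)


# Toy embeddings: two tight, well-separated groups of nodes.
toy_embeddings = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
loss_clustered = prototypical_loss(toy_embeddings, [0, 0, 1, 1])
loss_mixed = prototypical_loss(toy_embeddings, [0, 1, 0, 1])
```

When the labels agree with the clusters, the loss is close to zero; when the labels cut across the clusters, the prototypes collapse toward the middle and the loss grows, which is the signal that pushes embeddings of the same class together.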

Abstract

While graph neural networks (GNNs) have become the de facto standard for graph-based node classification, they impose a strong assumption on the availability of sufficient labeled samples. This assumption restricts the classification performance of prevailing GNNs in many real-world applications that suffer from low-data regimes. Specifically, features extracted from scarce labeled nodes cannot provide sufficient supervision for the unlabeled samples, leading to severe over-fitting. In this work, we point out that leveraging subgraphs to capture long-range dependencies can augment the representation of a node with homophily properties, thus alleviating the effects of the low-data regime. However, prior works leveraging subgraphs fail to capture the long-range dependencies among nodes. To this end, we present a novel self-supervised learning framework, called multi-view subgraph neural networks (Muse), for handling long-range dependencies. In particular, we propose an information theory-based identification mechanism to identify two types of subgraphs from the views of input space and latent space, respectively. The former captures the local structure of the graph, while the latter captures the long-range dependencies among nodes. By fusing these two views of subgraphs, the learned representations can preserve the topological properties of the graph at large, including the local structure and long-range dependencies, thus maximizing their expressiveness for downstream node classification tasks. Experimental results show that Muse outperforms the alternative methods on node classification tasks with limited labeled data.
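The two-view idea can be illustrated with a deliberately simplified sketch. It replaces the paper's mutual-information-based subgraph identification with plain k-nearest-neighbor search in feature space, and it fuses views by simple averaging; both choices, along with the names `knn_latent_edges` and `fuse_views`, are assumptions for illustration only and not the authors' method.

```python
def knn_latent_edges(features, k):
    """Link each node to its k nearest nodes in feature space.

    Unlike hop-based neighborhoods, these links ignore graph distance, so
    nodes that are far apart in the input graph can become neighbors.
    """
    n = len(features)
    edges = set()
    for i in range(n):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(features[i], features[j])), j)
            for j in range(n)
            if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def fuse_views(node_vec, local_neighbors, latent_neighbors, features):
    """Average a node's vector with the mean of each neighbor view."""

    def mean(indices):
        if not indices:
            return list(node_vec)  # fall back to the node itself
        dim = len(node_vec)
        return [sum(features[j][d] for j in indices) / len(indices) for d in range(dim)]

    local = mean(local_neighbors)
    latent = mean(latent_neighbors)
    return [(x + a + b) / 3.0 for x, a, b in zip(node_vec, local, latent)]


# Toy path graph 0-1-2-3: nodes 0 and 3 are three hops apart in the input
# graph but have nearly identical features.
toy_features = [[0.0, 0.0], [5.0, 5.0], [9.0, 9.0], [0.1, 0.0]]
latent_edges = knn_latent_edges(toy_features, k=1)
fused_node0 = fuse_views(toy_features[0], [1], [3], toy_features)
```

In this toy example the latent view links nodes 0 and 3 even though no short path joins them in the input graph, while the local view of node 0 contains only its one-hop neighbor, node 1. The fused vector mixes both sources of context.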

Paper Structure

This paper contains 20 sections, 2 theorems, 30 equations, 7 figures, 6 tables, and 1 algorithm.

Introduction
Related Work
Self-supervised Learning
Multi-View Graph Learning
Manifold Assumption
Proposed Algorithm
Node Representation Learning
Self-supervised Multi-View Subgraph Augmentation
Learning Objective
Theoretical Analysis
Computational Complexity
Experimental Studies
Benchmark Datasets
Baselines
Experimental Setup
...and 5 more sections

Key Result

Theorem 1

(Rademacher complexity bound of neural networks; lu2021rademacher). Assume that the neural network has $d$ layers with parameter matrices $\mathbf{W}_1,\dots,\mathbf{W}_d$ that are at most $\mathbf{M}_1,\dots,\mathbf{M}_d$, and that the activation functions are 1-Lipschitz and positive-homogeneous. Let $x$ is u

Figures (7)

Figure 1: (a) Unlabeled nodes (in grey) can be predicted with high confidence because sufficient labeled nodes are available. (b) Unlabeled nodes are distant from the labeled nodes with homophily properties, so their classes are predicted with low confidence.
Figure 2: (a) The fixed hop limits the receptive field of the subgraph. (b) The structural information captured by the subgraph depends on the size of the subgraph. (c) Fusing two views of subgraphs can capture not only local structural information but also long-range dependencies.
Figure 3: Step 1. The raw graph is reconstructed as a latent graph in which distant yet informative nodes can be mapped close together; grey nodes denote unlabeled nodes and colored nodes denote labeled nodes. Step 2. A graph embedding network extracts the naive embedding and the latent embedding from the raw graph and the latent graph, respectively. Step 3. By maximizing mutual information, the naive subgraph and the latent subgraph are extracted from the naive embedding and the latent embedding, respectively. Step 4. The different embeddings are fused to achieve data augmentation, and the fused embedding is then used to calculate the classification loss. Step 5. To leverage the inductive bias of different topological structures, a prototypical loss is derived from the different subgraphs and the node embedding.
Figure 4: t-SNE visualization of features derived by GCN, GCN-Muse, GraphSAGE, and GraphSAGE-Muse under the 1-label-per-class setting. The features learned by GCN-Muse and GraphSAGE-Muse have compact clusters and clear boundaries.
Figure 5: Document classification accuracy with different hyper-parameters on the Cora dataset. (a) Results with different parameters $\lambda_p$. (b) Results with different numbers of layers. (c) Results with different parameters $k$. (d) Results with different parameters $\tau$.
...and 2 more figures

Theorems & Definitions (4)

Definition 1
Theorem 1
Definition 2
Theorem 2
