Adapt, Agree, Aggregate: Semi-Supervised Ensemble Labeling for Graph Convolutional Networks

Maryam Abdolali, Romina Zakerian, Behnam Roshanfekr, Fardin Ayar, Mohammad Rahmati

TL;DR

A novel framework that combines ensemble learning with augmented graph structures to improve the performance and robustness of semi-supervised node classification in graphs. Its ensemble-driven consensus captures subtle patterns that individual models often overlook, enabling the model to generalize better.

Abstract

In this paper, we propose a novel framework that combines ensemble learning with augmented graph structures to improve the performance and robustness of semi-supervised node classification in graphs. By creating multiple augmented views of the same graph, our approach harnesses the "wisdom of a diverse crowd", mitigating the challenges posed by noisy graph structures. Leveraging ensemble learning allows us to simultaneously achieve three key goals: adaptive confidence threshold selection based on model agreement, dynamic determination of the number of high-confidence samples for training, and robust extraction of pseudo-labels to mitigate confirmation bias. Our approach uniquely integrates adaptive ensemble consensus to flexibly guide pseudo-label extraction and sample selection, reducing the risks of error accumulation and improving robustness. Furthermore, the use of ensemble-driven consensus for pseudo-labeling captures subtle patterns that individual models often overlook, enabling the model to generalize better. Experiments on several real-world datasets demonstrate the effectiveness of our proposed method.
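
The agreement-based selection described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `consensus_pseudo_labels`, the unanimous-vote agreement rule, and the threshold rule (mean ensemble confidence over agreed unlabeled nodes) are assumptions made for illustration, since the abstract does not specify how the threshold is computed.

```python
import numpy as np

def consensus_pseudo_labels(probs, labeled_mask):
    """Pick pseudo-labels from k models' softmax outputs (hypothetical helper).

    probs:        array of shape (k, n, c), class probabilities per model and node
    labeled_mask: boolean array of shape (n,), True for nodes that already have labels
    Returns (selected node indices, their pseudo-labels, threshold used).
    """
    preds = probs.argmax(axis=2)              # (k, n) predicted class per model
    conf = probs.max(axis=2)                  # (k, n) confidence per model
    # ASSUMPTION: "agreement" means every model predicts the same class.
    agreed = (preds == preds[0]).all(axis=0)  # (n,)
    candidates = agreed & ~labeled_mask
    mean_conf = conf.mean(axis=0)             # (n,) ensemble-averaged confidence
    if not candidates.any():
        empty = np.array([], dtype=int)
        return empty, empty, 1.0
    # ASSUMPTION: the adaptive threshold is the mean confidence of the agreed,
    # unlabeled nodes; the paper's actual rule may differ.
    theta_conf = float(mean_conf[candidates].mean())
    selected = np.flatnonzero(candidates & (mean_conf >= theta_conf))
    return selected, preds[0, selected], theta_conf
```

Under these assumptions, both the threshold and the number of selected nodes change from epoch to epoch with the ensemble's outputs, rather than being fixed hyperparameters.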

Paper Structure

This paper contains 25 sections, 8 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: Overall schematic of our approach. The given graph $G(V, E)$ is first augmented to generate $k$ similar yet diverse 'views' of the graph, which are then assigned to $k$ GCN models independently. The feature matrix $X$ remains consistent across all models. In each epoch, the outputs of these models serve two purposes: setting the adaptive threshold for high-confidence samples and selecting a set of nodes to be assigned pseudo-labels. These selected pseudo-labels are used as supervision for each of the $k$ models. The outputs are then passed to an agreement module, which identifies the consensus pseudo-labels; these are subsequently provided to a final consensus GCN model along with the original graph.
  • Figure 2: Two-dimensional embeddings of the nodes and the spread of high-confidence nodes (in red) and agreed nodes (in green).
  • Figure 3: Ratio of correct pseudo-labels for A3-GCN and the conservative baseline across epochs.
  • Figure 4: Comparison of the accuracy of individual models vs. the consensus model in each epoch for the (a) Cora, (b) Citeseer, and (c) Pubmed datasets.
  • Figure 5: The adaptive values of $\theta_{conf}$ across different epochs for various datasets typically decrease rapidly in the initial epochs and then converge to a stable value in the later epochs.
  • ...and 3 more figures
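
The first step in the Figure 1 caption, generating $k$ diverse views of one graph, can be sketched as follows. The caption does not say which augmentation is used, so this example substitutes random edge dropping as a stand-in; `make_views`, `drop_prob`, and `seed` are hypothetical names, not taken from the paper.

```python
import numpy as np

def make_views(edges, k, drop_prob=0.1, seed=0):
    """Return k augmented edge lists, one per ensemble member (hypothetical helper).

    edges: array-like of shape (m, 2), one (source, target) pair per row.
    ASSUMPTION: each view is made by independently dropping every edge with
    probability drop_prob; the paper's augmentation may be different.
    """
    rng = np.random.default_rng(seed)
    edges = np.asarray(edges)
    views = []
    for _ in range(k):
        keep = rng.random(len(edges)) >= drop_prob  # True for edges that survive
        views.append(edges[keep])
    return views
```

Each view would be paired with the same feature matrix $X$ and passed to its own GCN, while the final consensus model is given the original, unaugmented edge list.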