Perturbation-based Graph Active Learning for Weakly-Supervised Belief Representation Learning

Dachun Sun, Ruijie Wang, Jinning Li, Ruipeng Han, Xinyi Liu, You Lyu, Tarek Abdelzaher

TL;DR

PerbALGraph is a perturbation-based active learning strategy, inspired by graph data augmentation, that progressively selects messages for labeling according to an automatic estimator, obviating human guidance.

Abstract

This paper addresses the problem of optimizing the allocation of labeling resources for semi-supervised belief representation learning in social networks. The objective is to strategically identify messages on social media graphs that are worth labeling within a constrained budget, so as to maximize task performance. Although unsupervised and semi-supervised methods have advanced belief and ideology representation learning on social networks, and graph learning techniques have proven remarkably effective, high-quality curated labeled social data can further improve performance. Consequently, allocating labeling effort is a critical research problem when labeling resources are limited. This paper proposes a perturbation-based active learning strategy inspired by graph data augmentation (PerbALGraph) that progressively selects messages for labeling according to an automatic estimator, obviating human guidance. The estimator is based on the principle that messages whose predictions are highly sensitive to structural features of the observational data have landmark quality: labeling them significantly influences the semi-supervised learning process. We define the estimator as the prediction variance under a set of designed graph perturbations, which makes it model-agnostic and application-independent. Extensive experimental results demonstrate the effectiveness of the proposed strategy for belief representation learning tasks.
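
The idea of scoring nodes by prediction variance under graph perturbations can be illustrated with a minimal sketch. This is not the paper's implementation: the perturbation used here (random edge dropping), the stand-in predictor (one step of neighbor averaging followed by a softmax), and all function and parameter names (`drop_edges`, `predict`, `variance_scores`, `select_for_labeling`, `drop_prob`, `n_perturb`) are assumptions chosen only to make the selection loop concrete.

```python
import numpy as np


def drop_edges(adj, drop_prob, rng):
    """Assumed perturbation: randomly remove edges of a symmetric adjacency matrix."""
    upper = np.triu(adj, k=1)                       # each undirected edge once
    keep = rng.random(upper.shape) >= drop_prob     # keep an edge with prob 1 - drop_prob
    upper = upper * keep
    return upper + upper.T                          # restore symmetry


def predict(adj, features):
    """Hypothetical stand-in model: average over neighbors (with self-loop), then softmax."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1, keepdims=True)
    h = (a_hat / deg) @ features
    e = np.exp(h - h.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def variance_scores(adj, features, n_perturb=20, drop_prob=0.2, seed=0):
    """Per-node score: variance of predictions across perturbed graphs, summed over classes."""
    rng = np.random.default_rng(seed)
    preds = np.stack([
        predict(drop_edges(adj, drop_prob, rng), features)
        for _ in range(n_perturb)
    ])                                              # shape: (n_perturb, n_nodes, n_classes)
    return preds.var(axis=0).sum(axis=1)


def select_for_labeling(scores, budget, labeled=()):
    """Pick the `budget` highest-scoring nodes that are not already labeled."""
    scores = scores.copy()
    scores[np.asarray(list(labeled), dtype=int)] = -np.inf
    return np.argsort(-scores)[:budget]
```

In an active learning loop, a sketch like this would be called once per round: compute `variance_scores` with the current model, query labels for the nodes returned by `select_for_labeling`, retrain, and repeat until the budget is spent. Because the score depends only on model outputs, any predictor could replace `predict`, which is the sense in which the abstract describes the estimator as model-agnostic.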

Paper Structure

This paper contains 24 sections, 8 equations, 7 figures, 2 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of PerbALGraph framework.
  • Figure 2: Macro F1 score curve of belief representation models trained in a semi-supervised fashion on AL-queried nodes under different budget constraints. The dotted flat gray line represents the performance of the model trained in an unsupervised fashion.
  • Figure 3: Visualization of EDCA dataset with 20 queried and testing nodes
  • Figure 4: Visualization of Russia/Ukraine Conflict (Visual) dataset with 20 queried nodes and testing nodes
  • Figure 5: Comparison of queried visual assertions in the Russia/Ukraine Conflict dataset by different AL methods. Results correspond to the graph visualization in Figure \ref{fig:examine_vis}.
  • ...and 2 more figures