Graph Partial Label Learning with Potential Cause Discovering

Hang Gao, Jiaguo Yuan, Jiangmeng Li, Peng Qiao, Fengge Wu, Changwen Zheng, Huaping Liu

TL;DR

This work tackles graph representation learning under Partial Label Learning (PLL), where each graph is associated with multiple candidate labels and only one is correct. It introduces Graph Partial Label Learning with Potential Cause Discovering (GPCD), a three-phase framework that pre-trains a GNN, estimates a graph causal subset via potential causes, and performs auxiliary training guided by this subset to mitigate label noise. The approach is supported by theoretical results linking potential causes to the graph causal subset and by extensive experiments on seven graph PLL datasets, showing improved accuracy and robustness to noise and distribution shifts. The method offers a principled way to isolate task-relevant graph structure under PLL, with practical implications for scalable, accurate graph learning in weakly supervised settings.

Abstract

Graph Neural Networks (GNNs) have garnered widespread attention for their potential to address the challenges of graph representation learning, which must handle complex graph-structured data across various domains. However, because graphs are inherently complex and interconnected, accurately annotating graph data for training GNNs is extremely challenging. To address this issue, we introduce Partial Label Learning (PLL) into graph representation learning. PLL is a critical weakly supervised learning problem in which each training instance is associated with a set of candidate labels comprising the ground-truth label and additional interfering labels. Because PLL tolerates annotator errors, it reduces the difficulty of data labeling. We then propose a novel graph representation learning method that enables GNN models to learn discriminative information effectively in the PLL setting. Our approach uses potential cause extraction to obtain graph data that is causally related to the labels. By conducting auxiliary training on the extracted graph data, our model can effectively eliminate interfering information in the PLL scenario. We support the rationale behind our method with a series of theoretical analyses. Moreover, we conduct extensive evaluations and ablation studies on multiple datasets, demonstrating the superiority of our proposed method.
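
To make the PLL setting concrete, the sketch below shows a generic candidate-set loss: the negative log of the total softmax probability that a classifier assigns to an instance's candidate labels. This is a common PLL baseline objective and is not claimed to be the loss used by GPCD; the function names, the logits, and the candidate sets are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_set_loss(logits, candidates):
    """Negative log of the probability mass placed on the candidate labels.

    Assumption: this is a generic PLL baseline objective, not the GPCD loss.
    """
    if not candidates:
        raise ValueError("candidate set must be non-empty")
    probs = softmax(logits)
    mass = sum(probs[c] for c in set(candidates))
    return -math.log(mass)


# Hypothetical example: one graph, three classes.
logits = [2.0, 0.5, 1.0]
loss_partial = candidate_set_loss(logits, {0, 2})    # ground truth plus one interfering label
loss_single = candidate_set_loss(logits, {0})        # ordinary supervised case
loss_full = candidate_set_loss(logits, {0, 1, 2})    # every label is a candidate
```

A candidate set containing every class carries no supervision, so its loss is essentially zero; a single-label set recovers the usual cross-entropy. Larger candidate sets therefore give weaker training signal, which is the ambiguity that a PLL method has to resolve.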

Paper Structure

This paper contains 35 sections, 5 theorems, 31 equations, 11 figures, 4 tables, 1 algorithm.

Key Result

Theorem 3

Given the $i$-th potential cause $\Gamma_{i}$ of $Y$ and the set of all potential causes $\bigcup_{i \in \mathcal{I}^p } \Gamma^{p}_{i}$ of $Y$, where $\mathcal{I}^p$ denotes the index set of the potential causes, it can be concluded that $G^{*} \subseteq (( \bigcup_{i \in \mathcal{I}^{p} } \Gamma^{

Figures (11)

  • Figure 1: Comparison results. For the GNN baseline, we use ARMA [arma]. We use the Graph-Twitter dataset [dataset:sst] to conduct the experiments.
  • Figure 2: The framework of our proposed method.
  • Figure 3: Visualization of the node-level prediction vector on the Graph-Twitter dataset. Each graph represents a sentence. The subplot titles indicate the sentiment type corresponding to each sample.
  • Figure 4: Statistics of the focus ratio on sentiment words for different methods.
  • Figure 5: t-SNE visualization of the graph features on Graph-Twitter with random label noise. The first and second rows show the features extracted from the training set and the test set, respectively.
  • ...and 6 more figures

Theorems & Definitions (8)

  • Definition 1
  • Definition 2
  • Theorem 3
  • Theorem 5
  • Theorem 6
  • Corollary 7
  • Lemma 8
  • proof