Uncertainty-Aware Robust Learning on Noisy Graphs

Shuyi Chen, Kaize Ding, Shixiang Zhu

TL;DR

This work addresses the robustness of graph neural networks to noisy graph data in semi-supervised node classification by introducing Distributionally Robust Graph Learning (DRGL). DRGL uses Distributionally Robust Optimization (DRO) with Wasserstein-ball uncertainty sets to minimize the worst-case risk over class-conditioned distributions, yielding Least Favorable Distributions (LFDs) that characterize the most challenging data under noise. An end-to-end differentiable optimization layer computes these LFDs and jointly updates the graph encoder to produce robust node embeddings; the overall loss combines the worst-case margins across classes. Empirical results on Cora, Citeseer, and Pubmed show that DRGL improves predictive accuracy under feature and edge noise while providing meaningful uncertainty quantification via entropies over the LFDs, indicating stronger, more reliable representations for noisy graphs.

Abstract

Graph neural networks (GNNs) have excelled in various graph learning tasks, particularly node classification. However, their performance is often hampered by noisy measurements in real-world graphs, which can corrupt critical patterns in the data. To address this, we propose a novel uncertainty-aware graph learning framework inspired by distributionally robust optimization. Specifically, we use a graph neural network-based encoder to embed the node features and find the optimal node embeddings by minimizing the worst-case risk through a minimax formulation. Such an uncertainty-aware learning process leads to improved node representations and a more robust graph predictive model that effectively mitigates the impact of uncertainty arising from data noise. Our experimental results demonstrate superior predictive performance over baselines across noisy scenarios.
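
The minimax formulation mentioned in the abstract can be written in a generic Wasserstein-DRO form. The following is a sketch under stated assumptions, not the paper's exact equations: $\theta$ denotes the encoder parameters, $\widehat{P}_m$ the empirical distribution of class-$m$ node embeddings, $\mathcal{P}_m$ the class-$m$ uncertainty set, and $\Psi$ the risk. The radius symbol $\rho_m$, the distance symbol $\mathcal{W}$, and the argument list of $\Psi$ are notation introduced here for illustration only.

```latex
\min_{\theta} \; \max_{P_m \in \mathcal{P}_m,\; 1 \le m \le M} \; \Psi(P_1, \dots, P_M),
\qquad
\mathcal{P}_m = \bigl\{ P : \mathcal{W}(P, \widehat{P}_m) \le \rho_m \bigr\}
```

The inner maximization selects the least favorable distributions within each Wasserstein ball, and the outer minimization updates the encoder so that the embeddings remain separable even under those distributions.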

Paper Structure

This paper contains 11 sections, 1 theorem, 6 equations, 5 figures, 3 tables, 1 algorithm.

Key Result

Proposition 1

For the uncertainty sets defined in eq:uncertainty-set, the least favorable distribution of problem eq:minmax can be obtained by solving the following problem: The decision variable $\gamma_m \in \mathbb{R}_{+}^{n \times n}$ can be viewed as a joint distribution on $n$ empirical points with marginal distributions $\widehat{P}_m$ and $P_m$, represented by a vector $p_m \in \mathbb{R}_{+}^{
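
To make the role of the coupling $\gamma_m$ concrete, here is a minimal pure-Python sketch of how an $n \times n$ joint distribution links an empirical distribution to a candidate distribution on the same $n$ points. The function names, the toy points, the absolute-distance cost, the radius value, and the convention that rows correspond to $\widehat{P}_m$ and columns to $P_m$ are all assumptions made for illustration; none of them come from the paper.

```python
# Illustrative sketch only: names, data, cost, and radius are hypothetical.

def coupling_marginals(gamma):
    """Return (row_sums, col_sums) of an n x n coupling matrix."""
    n = len(gamma)
    row_sums = [sum(gamma[i]) for i in range(n)]
    col_sums = [sum(gamma[i][j] for i in range(n)) for j in range(n)]
    return row_sums, col_sums


def transport_cost(gamma, cost):
    """Total cost of moving mass according to the coupling gamma."""
    n = len(gamma)
    return sum(gamma[i][j] * cost[i][j] for i in range(n) for j in range(n))


# Three one-dimensional "embedding" points with absolute-distance cost.
points = [0.0, 1.0, 3.0]
cost = [[abs(a - b) for b in points] for a in points]

# Coupling: the first point keeps its mass; the second point keeps 0.25
# and sends 0.25 to the third point.
gamma = [
    [0.5, 0.00, 0.00],
    [0.0, 0.25, 0.25],
    [0.0, 0.00, 0.00],
]

p_hat, p = coupling_marginals(gamma)  # row marginal, column marginal
w = transport_cost(gamma, cost)       # 0.25 * |1 - 3|
radius = 0.6                          # assumed Wasserstein-ball radius
feasible = w <= radius
```

Because the Wasserstein distance is the minimum transport cost over all couplings with the given marginals, any single coupling whose cost is within the radius certifies that the column marginal lies inside the ball around the row marginal.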

Figures (5)

  • Figure 1: An illustration of the uncertainty set in our proposed framework. The goal is to search for the graph distribution that minimizes the worst-case risk.
  • Figure 2: The architecture of the proposed framework consists of two cohesive modules: (1) a graph encoder parameterized by $\theta$, which produces the node embedding $\xi$ given the graph information $\mathcal{G}$; (2) a differentiable optimization layer, which generates the corresponding least favorable distributions (LFDs) $\{P_m^*\}$ for $\xi$ by solving the convex optimization defined in \ref{eq:reformulation}. The loss measures the total margin, obtained by summing up $\max_{1\le m \le M} P_m^*(\xi)$ across node embeddings.
  • Figure 3: The minimax problem \ref{eq:minmax} aims to find the least favorable distributions (LFDs) by searching for the optimal $P_m$ in the uncertainty set $\mathcal{P}_m$ that maximizes the risk $\Psi$.
  • Figure 4: The impact of noise on the learned feature spaces. (a) and (d) show the embeddings from graphs without noise; (b) and (e) show the embeddings when the graphs have nodal features with $2\sigma$ noise; and (c) and (f) present the representations from graphs where $20\%$ of the edges have been removed.
  • Figure 5: The learned embeddings and the uncertainty of a GCN trained with DRGL. Darker shades indicate a higher level of uncertainty between the different categories under the LFDs obtained by solving \ref{eq:reformulation}.
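
The total margin in the Figure 2 caption and the entropy-based uncertainty shown in Figure 5 can be illustrated with a small pure-Python sketch. It assumes the LFD masses $P_m^*(\xi_i)$ are already available as a table with one row per class and one column per node. The function names, the toy values, and in particular the choice to normalize the masses across classes at each node before taking the entropy are assumptions for illustration; the paper's exact uncertainty definition may differ.

```python
import math

# Illustrative sketch only: names, data, and normalization are hypothetical.

def total_margin(lfd):
    """Sum over nodes of max_m P_m^*(xi_i), as described in the Figure 2 caption."""
    n_nodes = len(lfd[0])
    return sum(max(row[i] for row in lfd) for i in range(n_nodes))


def node_entropy(lfd, i):
    """Entropy of the class-normalized LFD masses at node i (assumed definition)."""
    masses = [row[i] for row in lfd]
    total = sum(masses)
    if total == 0.0:
        # No class places mass on this node; return 0 by convention.
        return 0.0
    probs = [m / total for m in masses]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Two classes, three nodes; each row is a distribution over the nodes.
lfd = [
    [0.5, 0.25, 0.25],  # P_1^* evaluated at each node embedding
    [0.0, 0.25, 0.75],  # P_2^* evaluated at each node embedding
]

margin = total_margin(lfd)
entropies = [node_entropy(lfd, i) for i in range(3)]
```

In this toy table the second node receives equal mass from both classes, so its entropy is the largest of the three, which corresponds to a darker shade in a plot like Figure 5. The first node receives mass from only one class and has zero entropy.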

Theorems & Definitions (2)

  • Proposition 1
  • Remark 1