Linear Opinion Pooling for Uncertainty Quantification on Graphs

Clemens Damke, Eyke Hüllermeier

TL;DR

This work tackles uncertainty quantification for graph-structured semi-supervised node classification by challenging the irreducibility assumption in Graph Posterior Networks and introducing LopGPN, which represents epistemic uncertainty with mixtures of Dirichlet distributions and propagates it through the graph using linear opinion pooling guided by Personalized PageRank. The method derives a tractable training objective via entropy-bounded surrogates for the mixture, and scales with sparse APPNP to large graphs. Empirical results across multiple graph datasets show LopGPN achieving strong predictive accuracy and more reliable uncertainty estimates, including robust out-of-distribution detection, highlighting the approach as a principled and scalable way to quantify uncertainty on graphs. The work also clarifies the limitations of existing GPN axioms in real-world networks and points to future directions in combining epistemic and aleatoric information for uncertainty propagation on graphs.

Abstract

We address the problem of uncertainty quantification for graph-structured data, or, more specifically, the problem of quantifying the predictive uncertainty in (semi-supervised) node classification. Key questions in this regard concern the distinction between two different types of uncertainty, aleatoric and epistemic, and how to support uncertainty quantification by leveraging the structural information provided by the graph topology. Challenging assumptions and postulates of state-of-the-art methods, we propose a novel approach that represents (epistemic) uncertainty in terms of mixtures of Dirichlet distributions and uses the established principle of linear opinion pooling to propagate information between neighboring nodes in the graph. The effectiveness of this approach is demonstrated in a series of experiments on a variety of graph-structured datasets.
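To make the pooling idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each node carries a single Dirichlet distribution, that the pooling weights are the rows of a Personalized PageRank matrix approximated by APPNP-style power iteration, and that the adjacency matrix is row-normalized so that the weights of each node sum to one. The function names `ppr_matrix` and `pool_dirichlet_means`, the teleport probability, and the toy graph are all hypothetical. The sketch only computes the mean of each pooled mixture, which is the weighted average of the component Dirichlet means.

```python
import numpy as np


def ppr_matrix(adj, teleport=0.1, num_iters=10):
    """Approximate a Personalized PageRank matrix by power iteration.

    Assumption: self-loops are added and the adjacency matrix is
    row-normalized, so every row of the result is a set of convex weights.
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                          # add self-loops
    trans = a_hat / a_hat.sum(axis=1, keepdims=True)  # row-stochastic
    pi = np.eye(n)
    for _ in range(num_iters):
        pi = (1.0 - teleport) * trans @ pi + teleport * np.eye(n)
    return pi


def pool_dirichlet_means(pi, alphas):
    """Mean of each node's pooled mixture of Dirichlet distributions.

    Row i of `pi` holds the mixture weights of node i; row j of `alphas`
    holds the concentration parameters of node j's Dirichlet.
    """
    means = alphas / alphas.sum(axis=1, keepdims=True)
    return pi @ means


# Toy path graph 0 - 1 - 2 with two classes (illustrative values only).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
alphas = np.array([[50.0, 1.0],   # node 0: confident in class 0
                   [1.0, 1.0],    # node 1: uninformative
                   [1.0, 50.0]])  # node 2: confident in class 1

pi = ppr_matrix(adj)
pooled = pool_dirichlet_means(pi, alphas)
print(pooled)
```

Because the pooled object is a mixture rather than a single merged Dirichlet, the component distributions of the neighbors stay intact; only their weights change with the graph structure.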

Paper Structure (18 sections, 15 equations, 4 figures, 1 table)


Figures (4)

  • Figure 1: Total uncertainty (TU), aleatoric uncertainty (AU), and epistemic uncertainty (EU) for three second-order distributions on $\Delta_2$, namely, $Q_1 = \mathrm{Beta}(5,5)$, $Q_2 = \mathcal{U}[0,1]$, and $Q_3 = \frac{1}{2}\mathrm{Beta}(100,10) + \frac{1}{2}\mathrm{Beta}(10,100)$.
  • Figure 2: Illustration of how GPN aggregates the two conflicting predictions for A and B with low AU and low EU into predictions with high AU and low EU.
  • Figure 3: Illustration of how LopGPN preserves the AU of conflicting predictions.
  • Figure 4: Accuracy-rejection curves (ARCs) for different uncertainty measures. The x-axis represents the fraction of rejected test instances; the y-axis represents the test accuracy for a given rejection rate. The (small) shaded areas behind the curves represent the estimate's standard error.