Uncertainty Estimation for Heterophilic Graphs Through the Lens of Information Theory

Dominik Fuchsgruber, Tom Wollschläger, Johannes Bordne, Stephan Günnemann

TL;DR

This work addresses uncertainty estimation on heterophilic graphs, where traditional homophily-driven methods struggle. By casting MPNNs in an information-theoretic framework, it derives an analogue of the data processing inequality (a "Data Processing Equality") that accounts for information about the target being gained and lost across layers, revealing that deeper representations can carry unique, complementary information on heterophilic graphs. It introduces JLDE, a simple post-hoc KNN-based density estimator over the joint latent space of all layer embeddings, which achieves state-of-the-art epistemic uncertainty on heterophilic datasets while matching homophilic baselines without diffusion. The results show that exploiting information from all latent representations is crucial for reliable uncertainty estimation in non-i.i.d. graph settings, and they empirically validate a principled design guideline for uncertainty estimation in GNNs beyond homophily.
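To make the idea of a KNN-based density estimate over the joint latent space concrete, here is a minimal sketch. It is not the authors' JLDE implementation: the function name `joint_knn_uncertainty`, the Euclidean metric, the default `k`, and the use of the mean k-nearest-neighbor distance as the uncertainty score are all assumptions made for illustration. The only element taken from the summary above is that the embeddings of all layers are combined before the density is estimated.

```python
import numpy as np


def joint_knn_uncertainty(train_layers, query_layers, k=5):
    """Hypothetical helper: score query nodes by their mean distance to the
    k nearest training nodes in the concatenated (joint) embedding space.
    A larger score means lower estimated density, i.e. higher uncertainty."""
    # Concatenate per-layer embeddings along the feature axis: (n, sum_i d_i).
    train = np.concatenate(train_layers, axis=1)
    query = np.concatenate(query_layers, axis=1)

    # Pairwise Euclidean distances, shape (n_query, n_train).
    diff = query[:, None, :] - train[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=-1))

    # Keep the k smallest distances per query node (order within them is irrelevant).
    k = min(k, train.shape[0])
    nearest = np.partition(dists, k - 1, axis=1)[:, :k]
    return nearest.mean(axis=1)
```

Each entry of `train_layers` and `query_layers` is assumed to be an array of shape `(num_nodes, layer_dim)` holding one layer's node embeddings. The brute-force distance matrix keeps the sketch short; a spatial index would be the natural replacement on larger graphs.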

Abstract

While uncertainty estimation for graphs has recently gained traction, most methods rely on homophily and deteriorate in heterophilic settings. We address this by analyzing message passing neural networks from an information-theoretic perspective and developing a suitable analog to the data processing inequality to quantify information throughout the model's layers. In contrast to non-graph domains, information about the node-level prediction target can increase with model depth if a node's features are semantically different from its neighbors. Therefore, on heterophilic graphs, the latent embeddings of an MPNN each provide different information about the data distribution, unlike in homophilic settings. This reveals that considering all node representations simultaneously is a key design principle for epistemic uncertainty estimation on graphs beyond homophily. We empirically confirm this with a simple post-hoc density estimator on the joint node embedding space that provides state-of-the-art uncertainty on heterophilic graphs. At the same time, it matches prior work on homophilic graphs without explicitly exploiting homophily through post-processing.
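For reference, the classical data processing inequality that the abstract contrasts against can be written as follows. This is the standard textbook statement for i.i.d. data, not the paper's MPNN result, and the chain notation (with $\textnormal{Z}^{(0)}$ as the input and $L$ layers) is an assumption chosen to match the symbols used on this page. If the target and the hidden representations form a Markov chain

$$
{\textnormal{Y}} \rightarrow \textnormal{Z}^{(0)} \rightarrow \textnormal{Z}^{(1)} \rightarrow \dots \rightarrow \textnormal{Z}^{(L)},
$$

then the mutual information with the target cannot grow with depth:

$$
I\big({\textnormal{Y}}; \textnormal{Z}^{(i+1)}\big) \le I\big({\textnormal{Y}}; \textnormal{Z}^{(i)}\big) \quad \text{for all } 0 \le i < L.
$$

In an MPNN, each layer additionally reads from neighbors one hop further away, so the representations of a single node no longer form such a chain. This is why information about the target can increase with depth on graphs.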

Paper Structure

This paper contains 29 sections, 7 theorems, 24 equations, 13 figures, 14 tables, 1 algorithm.

Key Result

Theorem 4.1

Let $\textnormal{Z}^{(i)}$ be random variables corresponding to the hidden representations of a node $v$ in an MPNN according to the message-passing update (eq:message_passing) after the $i$-th layer. Let $\textnormal{G}_{v} = \sqcup_i \textnormal{G}_{v}^{(i)}$ and ${\textnormal{Y}}$ random variables representing the correspond…

Figures (13)

  • Figure 1: Information propagation in MPNNs. In the $i$-th iteration, information about $i$-hop neighbors of the anchor node $v$ is gained ($\Delta_{(+)}^{(i)}$) while information about neighbors at smaller distances may be lost to processing ($\Delta_{(-)}^{(0:i-1)}$). In heterophilic graphs, the ${\mathcal{G}}_{v}^{(i)}$ are semantically different and each latent representation $\textnormal{Z}^{(i)}$ often contains different information about the target ${\textnormal{Y}}$. Therefore, density-based uncertainty must be estimated jointly from all $\textnormal{Z}^{(i)}$ to fully capture all available information about the data distribution.
  • Figure 2: NNs for i.i.d. data induce a Markov Chain of random variables $\textnormal{Z}^{(i)}$ that correspond to its hidden representations.
  • Figure 3: Probabilistic model of how the information is processed in MPNNs. The $i$-th layer updates the representations of each $k$-order ego graph ${\mathcal{G}}_{v}^{(k)}$ described by random variables $\textnormal{Z}^{(i)}_{k}$.
  • Figure 4: O.o.d. detection AUC-ROC ($\uparrow$) under a leave-out-classes distribution shift when estimating the density from different layers of a Res-GCN backbone. On heterophilic datasets (Amazon Ratings, Roman Empire), joint density estimation performs best as different layer representations provide different information.
  • Figure 5: Compatibility matrices for each dataset.
  • ...and 8 more figures
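The AUC-ROC reported for o.o.d. detection in Figure 4 can be read as the probability that a randomly chosen o.o.d. node receives a higher uncertainty score than a randomly chosen in-distribution node. The following is a minimal sketch of that metric. The function name `ood_auc_roc` is hypothetical, and the pairwise formulation is a standard equivalent of the AUC-ROC, not code from the paper.

```python
import numpy as np


def ood_auc_roc(scores_id, scores_ood):
    """Hypothetical helper: AUC-ROC for o.o.d. detection from uncertainty scores.
    Returns the fraction of (o.o.d., in-distribution) pairs in which the o.o.d.
    node has the higher score, counting ties as one half."""
    scores_id = np.asarray(scores_id, dtype=float)
    scores_ood = np.asarray(scores_ood, dtype=float)

    # Compare every o.o.d. score with every in-distribution score.
    greater = (scores_ood[:, None] > scores_id[None, :]).sum()
    ties = (scores_ood[:, None] == scores_id[None, :]).sum()
    return (greater + 0.5 * ties) / (scores_ood.size * scores_id.size)
```

Evaluating this function once per layer and once on scores computed from the joint embedding gives the kind of per-layer versus joint comparison the figure describes. A value of 0.5 corresponds to chance and 1.0 to perfect separation.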

Theorems & Definitions (14)

  • Theorem 4.1: Data Processing Equality for MPNNs
  • Definition 4.1
  • Proposition 4.1
  • Definition 4.2
  • Definition 4.3
  • Proposition 4.2
  • Theorem 1.1: Data Processing Equality for MPNNs
  • proof
  • Proposition 1.1
  • proof
  • ...and 4 more