Learning Invariant Graph Representations Through Redundant Information

Barproda Halder, Pasan Dissanayake, Sanghamitra Dutta

TL;DR

This work addresses out-of-distribution (OOD) generalization for graph classification by using Partial Information Decomposition (PID) to focus on the redundant information about the target that is shared between invariant and spurious subgraphs. It introduces RIG, a multi-level, alternating-optimization framework that disentangles causal from spurious graph components by maximizing redundancy and using a contrastive objective, guided by an environment assistant. Theoretical connections between structural causal models (SCMs) and PID motivate the maximization of redundant information as a central objective, and experiments on synthetic and real-world datasets show improved OOD robustness over strong baselines. Overall, the approach provides a principled, information-theoretic mechanism for deriving more reliable invariant graph representations under diverse distribution shifts.

Abstract

Learning invariant graph representations for out-of-distribution (OOD) generalization remains challenging because the learned representations often retain spurious components. To address this challenge, this work introduces a new tool from information theory called Partial Information Decomposition (PID) that goes beyond classical information-theoretic measures. We identify limitations in existing approaches for invariant representation learning that rely solely on classical information-theoretic measures, motivating the need to focus precisely on the redundant information about the target $Y$ shared between spurious subgraphs $G_s$ and invariant subgraphs $G_c$, obtained via PID. Next, we propose a new multi-level optimization framework, which we call Redundancy-guided Invariant Graph learning (RIG), that maximizes redundant information while isolating spurious and causal subgraphs, enabling OOD generalization under diverse distribution shifts. Our approach alternates between estimating a lower bound on the redundant information (which itself requires an optimization) and maximizing that bound along with additional objectives. Experiments on both synthetic and real-world graph datasets demonstrate the generalization capabilities of our proposed RIG framework.

Paper Structure

This paper contains 14 sections, 4 theorems, 21 equations, 9 figures, 4 tables, 1 algorithm.

Key Result

Proposition 1

The total predictive information that the invariant variable $C$ and the spurious variable $S$ contain about the target variable $Y$ decomposes into four nonnegative terms:
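
In standard PID notation (assumed here, since the equation itself is not reproduced above; the symbols $\mathrm{Uni}$, $\mathrm{Red}$, and $\mathrm{Syn}$ follow common usage and may differ from the paper's own), the four terms would be the information unique to $C$, the information unique to $S$, the redundant information shared by both, and the synergistic information available only from the two jointly:

$$
\mathrm{I}(Y; C, S) = \mathrm{Uni}(Y{:}C \mid S) + \mathrm{Uni}(Y{:}S \mid C) + \mathrm{Red}(Y{:}C, S) + \mathrm{Syn}(Y{:}C, S)
$$

Under the same assumed notation, the usual PID consistency relations give the single-source terms as $\mathrm{I}(Y; C) = \mathrm{Uni}(Y{:}C \mid S) + \mathrm{Red}(Y{:}C, S)$ and $\mathrm{I}(Y; S) = \mathrm{Uni}(Y{:}S \mid C) + \mathrm{Red}(Y{:}C, S)$, which is why classical mutual information alone cannot separate the redundant part from the unique parts.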

Figures (9)

  • Figure 1: Causally-aligned graph neural network.
  • Figure 2: Decomposition of $\mathrm{I}(Y; A, B)$.
  • Figure 3: Graph generation with distribution shifts.
  • Figure 4: Proposed redundancy-based invariant graph learning framework. Highlighted in orange are the components that are being updated in each step.
  • Figure 5: Comparison of PID values across baseline methods with Two-piece dataset {0.8,0.9}.
  • ...and 4 more figures

Theorems & Definitions (9)

  • Definition 1: Unique information (Bertschinger et al., 2014)
  • Proposition 1
  • Lemma 1: FIIF
  • proof
  • Lemma 2: PIIF
  • proof
  • Definition 2: $I_\cap$ measure (Griffith et al., 2014)
  • Lemma 3: Noisy Feature
  • proof