Feature Matching Intervention: Leveraging Observational Data for Causal Representation Learning
Haoze Li, Jun Xie
TL;DR
The paper introduces Feature Matching Intervention (FMI), a covariate-matching strategy that emulates perfect interventions in the latent causal graph to identify true causal features from a single training environment. It formalizes a causal latent-graph framework, provides a theoretical minimax guarantee for FMI under a set of structural assumptions, and proposes a validation-based workflow to detect when the learned feature is spurious. Empirical results on synthetic data, Colored MNIST, and WaterBirds show that FMI outperforms standard empirical risk minimization (ERM) and invariance-based methods, with strong out-of-distribution (OOD) generalization and feature fidelity. The work advances causal representation learning by enabling intervention-like identifiability without requiring multiple environments. Future directions include handling multiple spurious features and broader covariate-shift settings.
Abstract
A major challenge in causal discovery from observational data is the absence of perfect interventions, which makes it difficult to distinguish causal features from spurious ones. We propose Feature Matching Intervention (FMI), an approach that uses a matching procedure to mimic perfect interventions. We define causal latent graphs, which extend structural causal models to the latent feature space and provide a framework that connects FMI with causal graph learning. Our feature matching procedure emulates perfect interventions within these causal latent graphs. Theoretical results demonstrate that FMI exhibits strong out-of-distribution (OOD) generalizability. Experiments further highlight FMI's superior performance in identifying causal features solely from observational data.
