Spectral Representation for Causal Estimation with Hidden Confounders
Haotian Sun, Antoine Moulin, Tongzheng Ren, Arthur Gretton, Bo Dai
TL;DR
This work tackles causal effect estimation in the presence of hidden confounders, focusing on two settings: instrumental variable (IV) regression with observed confounders (IV-OC) and proxy causal learning (PCL). It introduces a spectral representation built on a low-rank factorization of the conditional expectation operator, then optimizes a saddle-point objective to learn the causal function together with its dual function, which enables efficient representation learning and avoids double-sampling bias. The authors derive explicit function classes for IV, IV-OC, and PCL, propose contrastive learning to discover the spectral bases, and develop practical algorithms (SpecIV and SpecPCL) with finite-dimensional representations. Experiments on the dSprites and Demand Design benchmarks show that SpecIV and SpecPCL achieve state-of-the-art accuracy and efficiency on IV and PCL tasks, supporting the approach's scalability and robustness.
Abstract
We address the problem of causal effect estimation where hidden confounders are present, with a focus on two settings: instrumental variable (IV) regression with additional observed confounders, and proxy causal learning. Our approach uses a singular value decomposition of a conditional expectation operator, followed by a saddle-point optimization problem, which, in the context of IV regression, can be thought of as a neural network generalization of the seminal approach due to Darolles et al. [2011]. Saddle-point formulations have attracted considerable attention recently, as they can avoid double-sampling bias and are amenable to modern function approximation methods. We provide experimental validation in various settings, and show that our approach outperforms existing methods on common benchmarks.
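
To make the role of the singular value decomposition concrete, the following is a minimal toy sketch, not the authors' SpecIV algorithm. It assumes discrete X and Z, so the conditional expectation operator is simply the matrix `P[z, x] = p(x | z)`, and it works with population quantities rather than samples. It replaces the saddle-point training step with a closed-form solve in the rank-`d` spectral basis. All names (`A`, `B`, `P`, `f_true`, `f_hat`) and sizes are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, d = 6, 8, 3  # |Z|, |X|, and an assumed rank (illustrative sizes)

# Low-rank conditional distribution: p(x | z) = sum_k A[z, k] * B[k, x].
A = rng.dirichlet(np.ones(d), size=m)  # shape (m, d), each row sums to 1
B = rng.dirichlet(np.ones(n), size=d)  # shape (d, n), each row sums to 1
P = A @ B                              # shape (m, n), P[z, x] = p(x | z)

# Pick a causal function inside the row space of P so that it is identifiable.
f_true = B.T @ rng.normal(size=d)

# Population IV moment, assuming the confounded noise has zero mean given Z:
# E[Y | Z = z] = sum_x p(x | z) f(x).
r = P @ f_true

# Rank-d SVD of the conditional expectation operator.
U, s, Vt = np.linalg.svd(P, full_matrices=False)
U_d, s_d, V_d = U[:, :d], s[:d], Vt[:d].T

# Solve P f = r within the d-dimensional spectral basis.
f_hat = V_d @ ((U_d.T @ r) / s_d)

print("max abs error:", np.max(np.abs(f_hat - f_true)))
```

In this toy setting the finite-dimensional basis comes from an exact SVD of a known matrix. With continuous variables and finite samples, the spectral bases would instead have to be learned from data, which is the role the TL;DR assigns to contrastive learning.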
