Contrastive representations of high-dimensional, structured treatments
Oriol Corcoll Andreu, Athanasios Vlontzos, Michael O'Riordan, Ciaran M. Gilligan-Lee
TL;DR
The paper addresses causal effect estimation when treatments are high-dimensional and structured, showing that naïve use of such treatments can bias results due to non-causal latent factors. It introduces a contrastive representation learning approach to extract a causally relevant treatment latent $T_C$, yielding an unbiased CATE $\tau(T,T',x) = E(Y\mid do(T),X=x) - E(Y\mid do(T'),X=x)$. The authors prove that the learned representation $\psi(T)$ identifies the causal latents and discards non-causal ones, providing theoretical guarantees via a contrastive framework and Theorem 4.2, and validate the approach on synthetic and real datasets where it outperforms existing methods. This work enables robust causal inference in domains with complex treatments (e.g., text, graphs, molecules) and has practical implications for recommendation systems and drug discovery.
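To make the CATE contrast $E(Y\mid do(T),X=x) - E(Y\mid do(T'),X=x)$ concrete, here is a minimal sketch under a hypothetical linear outcome model in which the expected outcome depends only on the causal latent $T_C$ and the covariate $x$. The function names, the coefficients, and the encoding of a treatment as a (causal latent, non-causal latent) pair are assumptions made for illustration. This is not the authors' model or estimator.

```python
def expected_outcome(t_causal, x, effect=2.0, covariate_weight=0.5):
    """E[Y | do(T), X=x] under an assumed toy linear model.

    Only the causal latent T_C and the covariate x enter the outcome;
    the non-causal latent of the treatment is deliberately ignored.
    """
    return effect * t_causal + covariate_weight * x


def cate(treatment, other_treatment, x):
    """tau(T, T', x) = E[Y | do(T), X=x] - E[Y | do(T'), X=x]."""
    t_causal, _non_causal = treatment
    other_causal, _other_non_causal = other_treatment
    return expected_outcome(t_causal, x) - expected_outcome(other_causal, x)


# Hypothetical treatments encoded as (causal latent, non-causal latent).
treatment_a = (1.0, -3.0)
treatment_b = (1.0, 7.0)   # same causal latent as a, different non-causal latent
treatment_c = (0.0, -3.0)  # different causal latent, same non-causal latent as a

effect_same_causal = cate(treatment_a, treatment_b, x=1.0)
effect_diff_causal = cate(treatment_a, treatment_c, x=1.0)
```

In this toy setting, two treatments that differ only in their non-causal latent have zero effect relative to each other, while a change in $T_C$ produces a nonzero contrast. This is the property that a representation $\psi(T)$ retaining only the causal latents is meant to preserve.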
Abstract
Estimating causal effects is vital for decision making. In standard causal effect estimation, treatments are usually binary- or continuous-valued. However, in many important real-world settings, treatments can be structured, high-dimensional objects, such as text, video, or audio. This poses a challenge for traditional causal effect estimation. While leveraging the shared structure across different treatments can help generalize to unseen treatments at test time, we show in this paper that using such structure blindly can lead to biased causal effect estimation. We address this challenge by devising a novel contrastive approach to learn a representation of the high-dimensional treatments, and prove that it identifies the underlying causal factors and discards the non-causal ones. We prove that this treatment representation leads to unbiased estimates of the causal effect, and empirically validate and benchmark our results on synthetic and real-world datasets.
