Learning Counterfactual Outcomes Under Rank Preservation

Peng Wu, Haoxuan Li, Chunyuan Zheng, Yan Zeng, Jiawei Chen, Yang Liu, Ruocheng Guo, Kun Zhang

TL;DR

This work tackles individual counterfactual inference without a known structural causal model by introducing rank preservation as an identifiability condition, linking $Y_x$ and $Y_{x'}$ through their conditional rankings. It replaces prior SCM-dependent or bi-level quantile regressions with a convex ideal loss $R_{x'}(t|x,z,y)$ and a kernel-based IPW estimator that achieve unbiased learning of $y_{x'}$ under $Z=z$, with provable consistency. Theoretical results establish convexity, unbiasedness, and bias characterization of the estimator, and the method extends to continuous treatments. Empirical results on semi-synthetic IHDP data and real Jobs data show the approach outperforms baselines, especially out-of-sample, demonstrating robustness to rank violations and providing a practical tool for fine-grained counterfactual analysis under weaker assumptions.

Abstract

Counterfactual inference aims to estimate the counterfactual outcome at the individual level given knowledge of an observed treatment and the factual outcome, with broad applications in fields such as epidemiology, econometrics, and management science. Previous methods rely on a known structural causal model (SCM) or assume the homogeneity of the exogenous variable and strict monotonicity between the outcome and exogenous variable. In this paper, we propose a principled approach for identifying and estimating the counterfactual outcome. We first introduce a simple and intuitive rank preservation assumption to identify the counterfactual outcome without relying on a known structural causal model. Building on this, we propose a novel ideal loss for theoretically unbiased learning of the counterfactual outcome and further develop a kernel-based estimator for its empirical estimation. Our theoretical analysis shows that the rank preservation assumption is not stronger than the homogeneity and strict monotonicity assumptions, that the proposed ideal loss is convex, and that the proposed estimator is unbiased. Extensive semi-synthetic and real-world experiments demonstrate the effectiveness of the proposed method.
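To make the role of rank preservation concrete, the following is a minimal sketch of a naive plug-in illustration. It is not the ideal loss or the kernel-based estimator proposed in the paper. It assumes a single discrete stratum $Z=z$ in which treatment is as good as randomized, so that the rank of the factual outcome $y$ among units with $X=x$ can be carried over to the outcome distribution of units with $X=x'$. The function name and its arguments are hypothetical and chosen only for this example.

```python
import numpy as np


def counterfactual_by_rank_matching(y_obs, y_factual_group, y_counterfactual_group):
    """Naive plug-in estimate of y_{x'} for one unit inside a single stratum Z = z.

    Hypothetical helper for illustration only (not the paper's estimator).

    y_obs: factual outcome y of the unit, observed under treatment x.
    y_factual_group: outcomes of units with X = x in the same stratum.
    y_counterfactual_group: outcomes of units with X = x' in the same stratum.
    """
    y_factual_group = np.asarray(y_factual_group, dtype=float)
    y_counterfactual_group = np.asarray(y_counterfactual_group, dtype=float)

    # Empirical conditional rank of the factual outcome:
    # tau is the fraction of X = x outcomes that are <= y_obs.
    tau = np.mean(y_factual_group <= y_obs)

    # Rank preservation (assumed): the unit keeps the same rank tau
    # in the outcome distribution under X = x', so read off that quantile.
    return float(np.quantile(y_counterfactual_group, tau))


if __name__ == "__main__":
    # Simulated stratum, purely illustrative: the same noise ordering drives
    # both potential outcomes, so ranks are preserved by construction.
    rng = np.random.default_rng(0)
    y_treated = 2.0 + rng.normal(size=500)
    y_control = 0.5 * rng.normal(size=500)
    print(counterfactual_by_rank_matching(3.0, y_treated, y_control))
```

With continuous covariates, exact strata are unavailable, which is one reason a smoothed estimator such as the kernel-based one described in the abstract becomes necessary.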

Paper Structure

This paper contains 21 sections, 10 theorems, 54 equations, 2 figures, 4 tables.

Key Result

Lemma 3.3

Under Assumptions 3.1 and 3.2, $y_{x'}$ is identifiable.

Figures (2)

  • Figure 1: Estimation performance of individual treatment effects under varying heterogeneity degrees.
  • Figure 2: The estimation performance with different kernels and bandwidths.

Theorems & Definitions (22)

  • Lemma 3.3
  • Definition 4.1 (Kendall, 1938)
  • Proposition 4.3
  • Proposition 4.4
  • Definition 4.5 (Kendall, 1945)
  • Proposition 4.7
  • Lemma 5.1
  • Theorem 5.2: Validity of the Proposed Ideal Loss
  • Proposition 5.3: Consistency
  • Theorem 5.4: Unbiasedness Preservation
  • ...and 12 more