Semiparametric Counterfactual Regression

Kwangho Kim

TL;DR

This work tackles counterfactual regression under shifts in treatment patterns by using incremental interventions, under which $E\{Y^{Q(\delta)}\}$ is identified given consistency and no unmeasured confounding. It develops a doubly robust, semiparametric framework that projects the regression of the counterfactual outcome $Y^{Q(\delta)}$ onto a finite‑dimensional model $f(X;\beta)$ subject to flexible constraints, and solves an approximating program based on the efficient influence function (EIF) to obtain $\hat{\beta}$ via cross‑fitting and modern optimization. Theoretical results establish $\sqrt{n}$‑consistency and asymptotic normality for a broad class of losses and constraints, including fairness criteria, with specialized analyses for smooth fixed feasible sets and for separable stochastic components with linear constraints. In numerical illustrations, the method adapts to unseen counterfactual scenarios and converges faster than its nuisance estimators, suggesting potential for domain adaptation, policy evaluation, and constrained predictive modeling under distributional shifts.

Abstract

We study counterfactual regression, which aims to map input features to outcomes under hypothetical scenarios that differ from those observed in the data. This is particularly useful for decision-making when adapting to sudden shifts in treatment patterns is essential. We propose a doubly robust-style estimator for counterfactual regression within a generalizable framework that accommodates a broad class of risk functions and flexible constraints, drawing on tools from semiparametric theory and stochastic optimization. Our approach uses incremental interventions to enhance adaptability while maintaining consistency with standard methods. We formulate the target estimand as the optimal solution to a stochastic optimization problem and develop an efficient estimation strategy that can leverage the rapid development of modern optimization algorithms. We then analyze the rates of convergence and characterize the asymptotic distributions. Our analysis shows that the proposed estimators can achieve $\sqrt{n}$-consistency and asymptotic normality for a broad class of problems. Numerical illustrations highlight their effectiveness in adapting to unseen counterfactual scenarios while maintaining parametric convergence rates.
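
To make the idea of a doubly robust-style counterfactual regression concrete, here is a minimal Python sketch for the special case of squared-error loss, a linear model, and no constraints. It is not the paper's estimator. It assumes the standard incremental propensity score form $q(x;\delta,\pi)=\delta\pi(x)/\{\delta\pi(x)+1-\pi(x)\}$, and it treats the nuisance functions as known, whereas in practice they would be estimated with cross-fitting. The function names and the simulated data-generating process are hypothetical.

```python
import numpy as np

def incremental_propensity(pi, delta):
    """Treatment probability after multiplying the odds of treatment by delta."""
    return delta * pi / (delta * pi + 1.0 - pi)

def pseudo_outcome(y, a, pi, mu0, mu1, delta):
    """Doubly robust-style pseudo-outcome.

    With correct nuisances (pi, mu0, mu1), its conditional mean given X is
    q * mu1 + (1 - q) * mu0, the mean outcome under the incremental intervention.
    """
    denom = delta * pi + 1.0 - pi
    q = incremental_propensity(pi, delta)
    plug_in = q * mu1 + (1.0 - q) * mu0
    # Outcome residuals, reweighted toward the shifted treatment mechanism.
    resid = (delta * a * (y - mu1) + (1 - a) * (y - mu0)) / denom
    # First-order correction for error in the propensity score.
    ps_corr = delta * (mu1 - mu0) * (a - pi) / denom**2
    return plug_in + resid + ps_corr

def fit_linear_projection(x, target):
    """Least-squares projection of target onto f(x; beta) = beta0 + beta1 * x."""
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta

# Hypothetical simulation: one covariate, binary treatment, unit treatment effect.
rng = np.random.default_rng(0)
n, delta = 20_000, 0.5
x = rng.uniform(-1.0, 1.0, size=n)
pi = 1.0 / (1.0 + np.exp(-x))            # true propensity score (assumed known)
a = rng.binomial(1, pi)
mu0, mu1 = x, x + 1.0                     # true outcome regressions (assumed known)
y = np.where(a == 1, mu1, mu0) + rng.normal(size=n)

phi = pseudo_outcome(y, a, pi, mu0, mu1, delta)
beta_hat = fit_linear_projection(x, phi)

# Oracle benchmark: project the true counterfactual regression directly.
q = incremental_propensity(pi, delta)
beta_oracle = fit_linear_projection(x, q * mu1 + (1.0 - q) * mu0)
print(beta_hat, beta_oracle)
```

Regressing the pseudo-outcome on the features stands in for the stochastic program only in this unconstrained squared-error case; other losses or constraints would call for a general-purpose solver instead of `np.linalg.lstsq`.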

Paper Structure

This paper contains 14 sections, 6 theorems, 59 equations, 2 figures.

Key Result

Theorem 4.1

Assume that eqn:smooth-fixed-true has a unique optimal solution $\beta^*$ (i.e., $\mathsf{s}^*(eqn:smooth-fixed-true)$ is a singleton), that eqn:objective-root-n-CAN holds, and that Assumptions assumption:LICQ-SC, assumption:quadratic-growth are satisfied. Then, where $\mathsf{B} = \left[\nabla_\beta g_j(\beta^*)^\top, \, j \in J_0(\beta^*) \right]$ and $\Sigma_{\beta^*}=\nabla^2_{\beta} \uppsi(\

Figures (2)

  • Figure 1: (a) Density of the intervention distribution $q(X; \delta, \pi)$ for varying values of $\delta$; (b) and (c) display the densities of the true counterfactual outcomes and the predicted outcomes from factual and proposed counterfactual regression methods for $\delta=0.1$ and $\delta=0.01$, respectively. The factual regression exhibits a notable distributional discrepancy from the true counterfactual outcomes, resulting in substantial estimation errors.
  • Figure 2: RMSE versus nuisance convergence rates for (a) $\delta=0.1$ and (b) $\delta=0.01$. The proposed estimators consistently attain convergence rates faster than those of the nuisance components.

Theorems & Definitions (19)

  • Remark 2.1
  • Example 1A: Constrained regression
  • Example 1B: Algorithmic fairness
  • Example 1C: Balance for the positive class
  • Example 2A: Cross-entropy loss
  • Example 2B: Mean squared logarithmic loss
  • Example 2C: $L_2$ loss with fairness criterion
  • Definition 4.1: LICQ
  • Definition 4.2: SC
  • Theorem 4.1
  • ...and 9 more
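
The matrix $\mathsf{B}$ in Theorem 4.1 stacks the constraint gradients $\nabla_\beta g_j(\beta^*)^\top$ over $j \in J_0(\beta^*)$. Under the standard definition of the linear independence constraint qualification (LICQ), the gradients of the active constraints must be linearly independent, which amounts to $\mathsf{B}$ having full row rank. The Python sketch below checks this numerically. It assumes that $J_0(\beta^*)$ denotes the active index set and that constraints are written as $g_j(\beta) \le 0$; the example constraints and function names are hypothetical and do not come from the paper.

```python
import numpy as np

def active_set(g_values, tol=1e-8):
    """Indices j whose constraint g_j(beta) <= 0 holds with equality (up to tol)."""
    return [j for j, g in enumerate(g_values) if abs(g) <= tol]

def licq_holds(grad_g, active):
    """True if the active constraint gradients (rows of grad_g) are linearly independent."""
    if not active:
        return True  # no active constraints, so nothing to check
    B = grad_g[active]  # rows: gradients of the active constraints
    return np.linalg.matrix_rank(B) == len(active)

# Hypothetical constraints in R^2:
#   g_0(beta) = beta_0 + beta_1 - 1 <= 0
#   g_1(beta) = -beta_0             <= 0
def constraint_values(beta):
    return np.array([beta[0] + beta[1] - 1.0, -beta[0]])

grads = np.array([[1.0, 1.0],
                  [-1.0, 0.0]])

beta = np.array([0.0, 1.0])                     # both constraints are active here
active = active_set(constraint_values(beta))
licq_ok = licq_holds(grads, active)

# Adding a rescaled copy of g_0 makes the active gradients linearly dependent.
grads_redundant = np.vstack([grads, [2.0, 2.0]])
licq_redundant = licq_holds(grads_redundant, [0, 2])
print(active, licq_ok, licq_redundant)
```

A rank check of this kind is only a numerical diagnostic: with nearly collinear gradients the result depends on the tolerance used by `np.linalg.matrix_rank`.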