Feasible Fusion: Constrained Joint Estimation under Structural Non-Overlap

Yuxi Du, Zhiheng Zhang, Haoxuan Li, Cong Fang, Jixing Xu, Peng Zhen, Jiecheng Guo

TL;DR

This paper proposes a constrained joint estimation framework that minimizes observational risk while enforcing causal validity through orthogonal experimental moment conditions, derives a penalized primal-dual algorithm that jointly learns representations and predictors, and establishes oracle inequalities that decompose the error into overlap-recovery, moment-violation, and statistical terms.

Abstract

Causal inference in modern large-scale systems faces growing challenges, including high-dimensional covariates, multi-valued treatments, massive observational (OBS) data, and limited randomized controlled trial (RCT) samples due to cost constraints. We formalize treatment-induced structural non-overlap and show that, under this regime, commonly used weighted fusion methods provably fail to satisfy randomized identifying restrictions. To address this issue, we propose a constrained joint estimation framework that minimizes observational risk while enforcing causal validity through orthogonal experimental moment conditions. We further show that structural non-overlap creates a feasibility obstruction for moment enforcement in the original covariate space. We also derive a penalized primal-dual algorithm that jointly learns representations and predictors, and establish oracle inequalities decomposing the error into overlap-recovery, moment-violation, and statistical terms. Extensive synthetic experiments demonstrate robust performance under varying degrees of non-overlap. A large-scale ride-hailing application shows that our method achieves substantial gains over existing baselines, matching the performance of models trained with significantly more RCT data.
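
To make the idea of "minimize observational risk subject to an experimental moment condition" concrete, the following is a minimal sketch on toy data, not the authors' algorithm. Everything in it is an assumption made for illustration: the data-generating process, the linear predictor, the single centered-treatment moment (a simple stand-in for the paper's orthogonal moments), and the augmented-Lagrangian updates with hypothetical parameters `rho` and `lr`. It also omits the representation-learning component entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, randomized):
    """Toy data (assumed): a hidden confounder u drives the outcome and,
    in the observational sample only, the treatment assignment."""
    u = rng.normal(size=n)
    x = rng.normal(size=n)
    p = np.full(n, 0.5) if randomized else 1.0 / (1.0 + np.exp(-2.0 * u))
    t = rng.binomial(1, p).astype(float)
    y = 1.0 + x + 2.0 * t + 3.0 * u + rng.normal(size=n)  # true effect is 2
    feats = np.column_stack([np.ones(n), x, t])            # linear predictor
    return feats, t, y

F_obs, _, y_obs = simulate(5000, randomized=False)    # large, confounded
F_rct, t_rct, y_rct = simulate(500, randomized=True)  # small, randomized

def obs_grad(theta):
    """Gradient of the observational risk 0.5 * mean squared residual."""
    return -F_obs.T @ (y_obs - F_obs @ theta) / len(y_obs)

w = t_rct - 0.5  # treatment centered at the known RCT propensity

def moment(theta):
    """Experimental moment: residuals should be uncorrelated with treatment."""
    return np.mean(w * (y_rct - F_rct @ theta))

moment_grad = -(F_rct.T @ w) / len(w)  # constant, since the moment is affine

# Unconstrained baseline: least squares on observational data only.
theta_ols = np.linalg.lstsq(F_obs, y_obs, rcond=None)[0]

# Penalized primal-dual loop (augmented Lagrangian / method of multipliers).
theta, lam, rho, lr = theta_ols.copy(), 0.0, 10.0, 0.1
for _ in range(30):                      # dual (outer) iterations
    for _ in range(200):                 # primal gradient steps
        g = moment(theta)
        theta -= lr * (obs_grad(theta) + (lam + rho * g) * moment_grad)
    lam += rho * moment(theta)           # dual ascent on the multiplier

theta_fused = theta
g_ols, g_fused = moment(theta_ols), moment(theta_fused)
print("moment violation  OLS: %.4f  constrained: %.4f" % (g_ols, g_fused))
print("effect estimate   OLS: %.3f  constrained: %.3f" % (theta_ols[2], theta_fused[2]))
```

In this toy setting the observational fit absorbs the hidden confounder into the treatment coefficient, while the constrained fit is pulled toward the value that zeroes the moment on the randomized sample. The sketch assumes full overlap between the two samples, so it does not exhibit the feasibility obstruction that the abstract attributes to structural non-overlap.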

Paper Structure

This paper contains 57 sections, 22 theorems, 169 equations, 2 figures, 2 tables, 1 algorithm.

Key Result

Proposition 2.5

Let $\mathcal{D}$ be a class of measurable test functions on $\mathcal{Z}\times\mathcal{T}$ and let $\mathrm{IPM}_{\mathcal{D}}(\cdot,\cdot)$ denote the associated integral probability metric. The representation $\phi$ is said to recover overlap if there exists $\varepsilon_{\mathrm{ov}}>0$ such that even though the original covariate space may exhibit violated overlap in the sense that $\mathrm{su

Figures (2)

  • Figure 1: Support Mismatch Across Different Overlap Levels (Two Scenarios)
  • Figure 2: Robustness under Severe Conditional Non-overlap

Theorems & Definitions (57)

  • Definition 2.1: Marginal (irreducible) structural non-overlap
  • Definition 2.2: Conditional (recoverable) structural non-overlap
  • Remark 2.3: Recoverable versus irreducible non-overlap
  • Definition 2.4: Representation Property for Joint Estimation
  • Proposition 2.5: Overlap Recovery
  • Proposition 2.6: Outcome-Relevant Information Preservation
  • Remark 2.7
  • Lemma 3.1: Baseline invariance of experimental moments
  • Theorem 3.4: Separation from weighted-loss fusion in the fusion regime
  • Definition 3.5: Feasibility gap
  • ...and 47 more