A Sensitivity Approach to Causal Inference Under Limited Overlap

Yuanzhe Ma, Hongseok Namkoong

TL;DR

This work introduces a finite-sample minimax framework for causal inference under limited overlap, separating overlap-driven estimation from extrapolation-required non-overlap regions. By imposing Lipschitz smoothness on the outcome function and leveraging Donoho's modulus of continuity, the authors derive minimax confidence intervals that bound the bias introduced by trimming or reweighting in non-overlap areas. The MP_ε (and its Lipschitz-parameterized variant MP_ε,L) intervals provide reliable, instance-specific uncertainty quantification, while a combined MP_combine approach preserves full ATE coverage with improved efficiency. Empirical demonstrations on simulated and PennUI data show that traditional asymptotic methods can underperform under poor overlap, whereas the proposed sensitivity framework offers robust diagnostics and data-collection guidance through confidence sequences. Overall, the paper delivers a practical, interpretable tool for robust causal inference in the presence of limited overlap with potential extensions to continual sampling and more complex treatment spaces.

Abstract

Limited overlap between treated and control groups is a key challenge in observational analysis. Standard approaches like trimming importance weights can reduce variance but introduce a fundamental bias. We propose a sensitivity framework for contextualizing findings under limited overlap, where we assess how irregular the outcome function has to be in order for the main finding to be invalidated. Our approach is based on worst-case confidence bounds on the bias introduced by standard trimming practices, under explicit assumptions necessary to extrapolate counterfactual estimates from regions of overlap to those without. Empirically, we demonstrate how our sensitivity framework protects against spurious findings by quantifying uncertainty in regions with limited overlap.
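To make the extrapolation assumption concrete, here is a minimal, noise-free sketch of how a Lipschitz constraint limits the outcome at a point with no comparable observations. It is not the paper's minimax estimator: it only illustrates the standard fact that if $f$ is $L$-Lipschitz and $f(x_j) = y_j$ at observed points, then $f(x)$ must lie in $[\max_j (y_j - L\,d(x, x_j)),\ \min_j (y_j + L\,d(x, x_j))]$. The function name `lipschitz_bounds`, the Euclidean distance, and the toy data are assumptions made for illustration.

```python
import numpy as np

def lipschitz_bounds(x_query, x_obs, y_obs, L):
    """Bounds on f(x_query) for any L-Lipschitz f with f(x_obs[j]) == y_obs[j].

    Hypothetical helper (not from the paper); assumes noise-free outcomes
    and Euclidean distance. x_query has shape (m, d), x_obs has shape (n, d).
    """
    x_query = np.asarray(x_query, dtype=float)
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    # Pairwise distances between query and observed points, shape (m, n).
    dist = np.linalg.norm(x_query[:, None, :] - x_obs[None, :, :], axis=-1)
    # Each observed point restricts f(x) to a cone around its outcome;
    # the feasible set is the intersection of those cones.
    lower = np.max(y_obs[None, :] - L * dist, axis=1)
    upper = np.min(y_obs[None, :] + L * dist, axis=1)
    return lower, upper

# Toy data: two observed treated units, one query point outside their range.
x_obs = [[0.0], [1.0]]
y_obs = [0.0, 0.5]
lower, upper = lipschitz_bounds([[2.0]], x_obs, y_obs, L=1.0)
print(lower, upper)
```

The interval widens as $L$ grows or as the query point moves away from the observed units, which is the sense in which a more irregular outcome function makes extrapolated findings easier to invalidate.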

Paper Structure

This paper contains 36 sections, 7 theorems, 150 equations, 16 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

Let $\mu \ge 0$ and $\Lambda \ge 0$ be the optimal dual variables corresponding to the Lipschitz constraints \ref{eqn:def-F-L-Lip-finite-sample}. The minimax estimator imputes counterfactual values as

Figures (16)

  • Figure 1: Left: data-generation process used in the simulation setup where $\pi(x) = \mathbb{P}(Z = 1 \mid X = x)$ denotes the propensity score, $q(x) = \min \left\{\pi(x),1-\pi(x)\right\}$ measures whether a point has sufficient overlap, and $f(x, z)$ represents the potential outcome for a unit with covariates $x$ under treatment assignment $z \in \left\{0, 1\right\}$. The individual treatment effect is defined as $\tau(x) = f(x, 1) - f(x, 0)$. Right: Visualization of one simulated observational dataset.
  • Figure 2: Confidence intervals from $\mathsf{AIPW}$ (left) and its trimmed variant $\mathsf{AIPW}_{\mathsf{partial}}$ (right) across different overlap levels, with the dotted red line representing the true estimand value; higher values on the $x$-axis mean more limited overlap. Left: $\mathsf{AIPW}$ yields very wide confidence intervals. Right: We follow standard heuristics to truncate data in a way such that $\mathsf{AIPW}_{\mathsf{partial}}$'s confidence interval has the smallest length.
  • Figure 3: Visualization of our method. In the overlap region, we use typical asymptotic confidence intervals. In the non-overlap region, we use the minimax approach to extrapolate from the overlap region. Our method allows the analyst to analyze the potential bias caused by ignoring samples with extreme propensity scores and see how this depends on the extrapolability of data from the non-overlap region to the overlap region.
  • Figure 4: For each point in the non-overlap region, we list the set of treated points from the overlap region used in its extrapolation. For example, the leftmost point $i$ in the non-overlap region uses points 1, 2, and 3 for extrapolation. This means that point pairs $(i,1), (i,2), (i,3)$ have binding Lipschitz constraints for the program that defines the minimax estimator $\hat{\tau}_{\delta_{\mathrm{FLCI}}}(\bm{w})$ in \ref{eqn:minimax-w-interlval-expression}.
  • Figure 5: Left: There are $2(k+1)n$ samples in total, and the middle region in pink is the overlap region. See Appendix \ref{sec:analytic-example-details} for details. Right: RMSE of the estimator $\hat{\tau}_{\delta}(\bm{w})$ vs $\delta$ with $n=25, k = 10, L = 1, \eta = 0.1, \xi = 0.01$.
  • ...and 11 more figures
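
The overlap score defined in the Figure 1 caption, $q(x) = \min\{\pi(x), 1-\pi(x)\}$, can be computed directly from propensity scores. The sketch below does so and then splits units into overlap and non-overlap groups. The cutoff rule $q(x) \ge \varepsilon$, the function names, and the propensity values are assumptions chosen for illustration; the source does not state the exact rule used to define the regions.

```python
import numpy as np

def overlap_score(pi):
    """q(x) = min{pi(x), 1 - pi(x)}, as defined in the Figure 1 caption."""
    pi = np.asarray(pi, dtype=float)
    return np.minimum(pi, 1.0 - pi)

def split_by_overlap(pi, eps):
    """Boolean mask: True where q(x) >= eps (hypothetical cutoff rule)."""
    return overlap_score(pi) >= eps

# Made-up propensity scores for four units.
pi = np.array([0.02, 0.30, 0.50, 0.97])
q = overlap_score(pi)
in_overlap = split_by_overlap(pi, eps=0.05)
print(q)
print(in_overlap)
```

Units with propensity near 0 or 1 get a small $q(x)$ and fall outside the overlap group, so any estimate for them has to rely on extrapolation rather than direct comparison.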

Theorems & Definitions (7)

  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Lemma 6
  • Lemma 7