Sharp Bounds for Treatment Effect Generalization under Outcome Distribution Shift

Amir Asiaee, Samhita Pal, Cole Beck, Jared D. Huling

TL;DR

The paper addresses generalizing randomized trial effects to a target population when outcome transportability may fail due to unmeasured effect modifiers. It introduces an outcome-shift sensitivity model that bounds the conditional outcome density ratio by a scalar $\Lambda$, yielding sharp, one-dimensional identified sets for the target ATE $\tau^o$ and a simple $O(n \log n)$ greedy algorithm to compute finite-sample bounds. The bounds are shown to be sharp and consistently estimable, with simulations demonstrating nominal coverage when the true shift lies within $\Lambda$ and substantial gains in informativeness over worst-case bounds. The approach blends ideas from marginal sensitivity models and generalization literature, providing a scalable, distributional, and interpretable tool for inference under transportability violations with broad practical impact.

Abstract

Generalizing treatment effects from a randomized trial to a target population requires the assumption that potential outcome distributions are invariant across populations after conditioning on observed covariates. This assumption fails when unmeasured effect modifiers are distributed differently between trial participants and the target population. We develop a sensitivity analysis framework that bounds how much conclusions can change when this transportability assumption is violated. Our approach constrains the likelihood ratio between target and trial outcome densities by a scalar parameter $\Lambda \geq 1$, with $\Lambda = 1$ recovering standard transportability. For each $\Lambda$, we derive sharp bounds on the target average treatment effect -- the tightest interval guaranteed to contain the true effect under all data-generating processes compatible with the observed data and the sensitivity model. We show that the optimal likelihood ratios have a simple threshold structure, leading to a closed-form greedy algorithm that requires only sorting trial outcomes and redistributing probability mass. The resulting estimator runs in $O(n \log n)$ time and is consistent under standard regularity conditions. Simulations demonstrate that our bounds achieve nominal coverage when the true outcome shift falls within the specified $\Lambda$, provide substantially tighter intervals than worst-case bounds, and remain informative across a range of realistic violations of transportability.
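The "sort and redistribute mass" idea in the abstract can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes equal empirical mass on each trial outcome within an arm, ignores covariates entirely, and combines the two arms as upper-minus-lower and lower-minus-upper, which is an assumption about how the arm-level bounds form an ATE interval. The function names `mean_bounds` and `ate_bounds` are hypothetical.

```python
from typing import Sequence, Tuple


def _greedy_upper(y: Sequence[float], lam: float) -> float:
    """Largest reweighted mean of y when every density ratio lies in
    [1/lam, lam] and the ratios average to one (assumed setup)."""
    if lam < 1.0:
        raise ValueError("lam must be >= 1")
    if len(y) == 0:
        raise ValueError("y must be non-empty")
    p = 1.0 / len(y)                  # empirical mass per trial outcome
    budget = 1.0 - 1.0 / lam          # mass left after every ratio sits at the floor 1/lam
    step = (lam - 1.0 / lam) * p      # mass needed to lift one point from 1/lam to lam
    total = 0.0
    for v in sorted(y, reverse=True):  # the sort dominates: O(n log n)
        add = min(step, max(budget, 0.0))  # give spare mass to the largest outcomes first
        total += (p / lam + add) * v
        budget -= add
    return total


def mean_bounds(y: Sequence[float], lam: float) -> Tuple[float, float]:
    """(lower, upper) bounds on the shifted mean; the lower bound reuses
    the upper-bound routine on negated outcomes."""
    upper = _greedy_upper(y, lam)
    lower = -_greedy_upper([-v for v in y], lam)
    return lower, upper


def ate_bounds(y1: Sequence[float], y0: Sequence[float], lam: float) -> Tuple[float, float]:
    """Assumed combination: bound each arm separately, then difference."""
    lo1, hi1 = mean_bounds(y1, lam)
    lo0, hi0 = mean_bounds(y0, lam)
    return lo1 - hi0, hi1 - lo0
```

With `lam = 1.0` the budget is zero and both bounds collapse to the sample mean, matching the statement that $\Lambda = 1$ recovers standard transportability. Larger values of `lam` shift more mass toward the extreme outcomes and widen the interval.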

Paper Structure (77 sections, 4 theorems, 42 equations, 15 figures, 7 tables)

Key Result

lemma 1

Fix $x$ and $a$, and suppose $Y \mid (A = a, X = x, S = r)$ has support contained in a bounded interval $[L, U]$. Any maximizer of eq:mu-plus-def takes the threshold form $r_a^\star(x, y) = \Lambda$ for $y > t$, $r_a^\star(x, y) = \Lambda^{-1}$ for $y < t$, and $r_a^\star(x, t) = r_0 \in [\Lambda^{-

Figures (15)

  • Figure 1: Outcome-shift sensitivity bounds
  • Figure 2: Sensitivity envelopes for DGPs 1--4. Each panel shows the sharp bound interval $[\hat{\tau}^{o,-}(\Lambda), \hat{\tau}^{o,+}(\Lambda)]$ as $\Lambda$ varies, with the true target ATE marked. Binary outcomes (DGP 3) yield tighter bounds.
  • Figure 3: Coverage (left) and mean width (right) vs. $\Lambda$ for DGP 1 ($R = 1000$). Coverage transitions from undercoverage at $\Lambda = 1$ to nominal levels by $\Lambda \approx 1.5$. Oracle width (dashed) confirms sharpness.
  • Figure 4: Comparison at $\Lambda = 2.0$ for DGP 1. Naive estimator and bootstrap CI fail to cover; worst-case bounds are uninformatively wide.
  • Figure 5: Identification vs. estimation (DGP 1). As $n^r$ grows, naive bootstrap CI shrinks but coverage deteriorates (left). Sharp bounds maintain coverage with stable widths (right).
  • ...and 10 more figures

Theorems & Definitions (5)

  • definition 1: Outcome-shift sensitivity model
  • lemma 1: Threshold structure
  • theorem 2: Sharp bounds for the target ATE
  • theorem 3: Greedy algorithm
  • theorem 4: Consistency
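
Definition 1 and the full statement of Lemma 1 are not reproduced on this page, so the following is only one plausible formalization pieced together from the abstract and the Key Result excerpt. The symbols $f^{o}_a$ and $f^{r}_a$ (target and trial outcome densities under treatment $a$) and $\mu_a^{+}(x)$ are assumed notation, not taken from the paper.

```latex
% Assumed form of the outcome-shift sensitivity model: the density ratio
% is bounded by \Lambda and integrates to one against the trial density.
\[
\Lambda^{-1} \le r_a(x, y) = \frac{f^{o}_a(y \mid x)}{f^{r}_a(y \mid x)} \le \Lambda,
\qquad
\int r_a(x, y)\, f^{r}_a(y \mid x)\, dy = 1 .
\]

% Assumed upper-bound problem and the threshold-form maximizer from Lemma 1.
\[
\mu_a^{+}(x) = \sup_{r_a} \int y\, r_a(x, y)\, f^{r}_a(y \mid x)\, dy,
\qquad
r_a^\star(x, y) =
\begin{cases}
\Lambda, & y > t, \\
r_0, & y = t, \\
\Lambda^{-1}, & y < t .
\end{cases}
\]
```

Under this reading, the threshold $t$ and the value $r_0$ at the threshold are pinned down by the normalization constraint. This would explain why the finite-sample computation reduces to sorting outcomes and moving mass toward one tail.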