Planning for gold: Hypothesis screening with split samples for valid powerful testing in matched observational studies

William Bekerman, Abhinandan Dalal, Carlo del Ninno, Dylan S. Small

TL;DR

This work develops a data-splitting design for observational studies to screen hypotheses in a planning sample while preserving valid inference in an analysis sample. Leveraging Rosenbaum’s sensitivity framework and the sensitivity value, the authors introduce Sens-Val, a bootstrap-assisted procedure that screens outcomes with robustness to unmeasured confounding and constructs predictive intervals for analysis-stage testing. Theoretical results (including Edgeworth expansions and local-power analyses) justify the finite-sample validity and power advantages, especially under higher bias levels, and simulations show Sens-Val often outperforms naive screening and full-sample corrections. The Bangladesh floods application demonstrates practical gains in identifying robust health, water, and economic effects under varying levels of unmeasured confounding, while maintaining FWER control. The approach offers flexible extensions to full matching, cross-screening, and alternative error-rate metrics, making it a pragmatic tool for causal inference in observational studies with many outcomes.

Abstract

Observational studies are valuable tools for inferring causal effects in the absence of controlled experiments. However, these studies may be biased due to the presence of some relevant, unmeasured set of covariates. One approach to mitigate this concern is to identify hypotheses likely to be more resilient to hidden biases by splitting the data into a planning sample for designing the study and an analysis sample for making inferences. We devise a powerful and flexible method for selecting hypotheses in the planning sample when an unknown number of outcomes are affected by the treatment, allowing researchers to gain the benefits of exploratory analysis and still conduct powerful inference under concerns of unmeasured confounding. We investigate the theoretical properties of our method and conduct extensive simulations that demonstrate pronounced benefits, especially at higher levels of allowance for unmeasured confounding. Finally, we demonstrate our method in an observational study of the multi-dimensional impacts of a devastating flood in Bangladesh.
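
The split-sample idea in the abstract can be illustrated with a minimal sketch. This is not the authors' Sens-Val procedure: it assumes matched pairs, swaps in a simple sign test with Rosenbaum's worst-case bound at bias level $\Gamma$, screens outcomes in the planning sample with an arbitrary threshold, and applies a Bonferroni correction over only the screened outcomes in the analysis sample. The function names, the planning fraction, and the screening threshold are all hypothetical choices made for illustration.

```python
import math
import random


def sign_test_upper_pvalue(diffs, gamma):
    """Worst-case one-sided sign-test p-value at bias level gamma >= 1.

    Under Rosenbaum's model for matched pairs, the number of positive
    treated-minus-control differences is stochastically bounded by a
    Binomial(n, gamma / (1 + gamma)) variable, where n counts non-tied pairs.
    """
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    positives = sum(1 for d in nonzero if d > 0)
    p_plus = gamma / (1.0 + gamma)
    total = 0.0
    for k in range(positives, n + 1):
        # Binomial probability computed in log space to avoid overflow.
        log_term = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p_plus) + (n - k) * math.log(1.0 - p_plus)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def split_screen_test(pair_diffs, gamma, planning_frac=0.2,
                      screen_level=0.2, alpha=0.05, seed=0):
    """Screen outcomes on a planning sample, then test on the analysis sample.

    pair_diffs maps each outcome name to a list of treated-minus-control
    differences, one per matched pair, in the same pair order for every outcome.
    planning_frac and screen_level are illustrative defaults, not recommendations.
    """
    n_pairs = len(next(iter(pair_diffs.values())))
    idx = list(range(n_pairs))
    random.Random(seed).shuffle(idx)
    n_plan = int(planning_frac * n_pairs)
    plan, analysis = idx[:n_plan], idx[n_plan:]

    # Planning stage: keep outcomes that look insensitive to bias gamma.
    selected = [
        name for name, d in pair_diffs.items()
        if sign_test_upper_pvalue([d[i] for i in plan], gamma) <= screen_level
    ]

    # Analysis stage: Bonferroni over the selected outcomes only.
    rejected = []
    if selected:
        threshold = alpha / len(selected)
        for name in selected:
            d = pair_diffs[name]
            p = sign_test_upper_pvalue([d[i] for i in analysis], gamma)
            if p <= threshold:
                rejected.append(name)
    return selected, rejected
```

Because the planning and analysis pairs are disjoint, the analysis-stage correction only has to account for the outcomes that survived screening, which is the source of the potential power gain over correcting for every outcome in the full sample.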

Paper Structure

This paper contains 42 sections, 10 theorems, 59 equations, 7 figures, and 5 tables.

Key Result

Proposition 1

Sample splitting for hypothesis screening satisfies:

Figures (7)

  • Figure 1: Simulation results in the large sample setting. (A) Results for the NUC setting. (B) Results for the UC setting.
  • Figure 2: Simulation results in the data-inspired setting. (A) Results for the NUC setting. (B) Results for the UC setting.
  • Figure 3: Simulation results for the data-inspired UC setting varying $\alpha_{\text{coverage}}$, with multiple values of $r$ and $\Gamma$.
  • Figure 4: Simulation results for the data-inspired UC setting with multiple values of $r$ and $\Gamma$.
  • Figure 5: Simulation results comparing Naive method and Bonferroni correction with multiple non-null proportions and varying number of hypotheses $L$.
  • ...and 2 more figures

Theorems & Definitions (11)

  • Proposition 1
  • proof
  • Theorem 1
  • Corollary 1
  • Proposition 2
  • Proposition 3
  • Theorem 2
  • Proposition 4
  • Theorem 3
  • Theorem 4
  • ...and 1 more