Towards Robust Matched Observational Studies with General Treatment Types: Consistency, Efficiency, and Adaptivity

Siyu Heng; Elaine K. Chiu; Hyunseung Kang

TL;DR

This paper addresses robustness of causal inferences from matched observational studies when treatments are non-binary by developing a treatment-agnostic sensitivity-analysis framework. It introduces a universal sensitivity parameter $\overline{\Gamma}$ and generalizes design sensitivity and Bahadur-Rosenbaum efficiency to binary, ordinal, and continuous treatments, enabling meaningful cross-treatment comparisons. A negative result shows that dichotomizing continuous treatments yields invalid sensitivity bounds, motivating the generalized framework and the adaptive testing procedure that combines two candidate statistics to achieve robustness. The proposed methods are demonstrated through extensive simulations across multiple dose–response shapes and a real data application on tobacco exposure and lung function, showing that no single test dominates and that adaptive testing provides robust performance. Overall, the work significantly extends sensitivity-analysis tools to general treatments, offering practical guidance for robust causal inference in dose-response and ordinal settings.

Abstract

To ensure reliable causal conclusions from observational (i.e., non-randomized) studies, researchers routinely conduct sensitivity analyses to assess robustness to hidden bias due to unmeasured confounding. In matched observational studies (one of the most widely used observational study designs), two foundational concepts, design sensitivity and Bahadur-Rosenbaum efficiency, are used to quantify the robustness of test statistics and study designs in sensitivity analyses. Unfortunately, these measures of robustness have not been developed for non-binary treatments (e.g., continuous or ordinal treatments), and consequently prevailing recommendations about robust tests may be misleading. In this work, we provide a unified framework, agnostic to treatment type, to quantify the robustness of test statistics and study designs. We first present a negative result about a popular, ad-hoc approach based on dichotomizing the treatment variable. Next, we introduce a universal, nearly sufficient sensitivity parameter that is agnostic to the underlying treatment type. We then generalize and derive all-in-one formulas for design sensitivity and Bahadur-Rosenbaum efficiency that can be used for any treatment type. We also propose a general data-adaptive approach to combine candidate test statistics to enhance robustness against unmeasured confounding. Extensive simulation studies and a data application illustrate our proposed framework. For practice, our results yield new, previously undiscovered insights about the robustness of tests and study designs in matched observational studies, especially when investigators are faced with non-binary treatments. [...]sed sensitivity analysis for the binary treatment case, built on the generalized Rosenbaum sensitivity bounds and large-scale mixed integer programming.
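For readers new to the concepts the abstract generalizes, the following is a minimal sketch of the classical binary-treatment case only, not the paper's generalized $\overline{\Gamma}$ framework. It assumes matched pairs without ties and uses the sign test: under Rosenbaum's model with bias at most $\Gamma$, the null probability that a pair's treated-minus-control difference is positive is at most $\Gamma/(1+\Gamma)$, which gives a worst-case binomial p-value. The second function uses the standard result that the sign test's design sensitivity is $p/(1-p)$, where $p$ is the probability of a positive pair difference when there is a real effect and no hidden bias. The function names are illustrative and do not come from the paper.

```python
from math import comb


def sign_test_upper_pvalue(n_pairs: int, n_positive: int, gamma: float) -> float:
    """Worst-case one-sided p-value of the sign test under Rosenbaum's model.

    Assumes matched pairs with a binary treatment and no tied pairs.
    With hidden bias at most gamma >= 1, the null probability that a
    treated-minus-control difference is positive is at most gamma / (1 + gamma).
    """
    if gamma < 1:
        raise ValueError("gamma must be >= 1")
    p_max = gamma / (1.0 + gamma)
    # Upper tail of Binomial(n_pairs, p_max) at the observed count.
    return sum(
        comb(n_pairs, k) * p_max**k * (1.0 - p_max) ** (n_pairs - k)
        for k in range(n_positive, n_pairs + 1)
    )


def sign_test_design_sensitivity(p_positive: float) -> float:
    """Design sensitivity of the sign test in the classical binary case.

    p_positive is the probability of a positive pair difference when the
    treatment has an effect and there is no unmeasured confounding.
    """
    if not 0.5 < p_positive < 1.0:
        raise ValueError("p_positive must lie strictly between 0.5 and 1")
    return p_positive / (1.0 - p_positive)


if __name__ == "__main__":
    # Hypothetical data: 100 matched pairs, 70 with a positive difference.
    for gamma in (1.0, 1.5, 2.0):
        print(gamma, sign_test_upper_pvalue(100, 70, gamma))
    print(sign_test_design_sensitivity(0.75))
```

The worst-case p-value grows with $\Gamma$; the largest $\Gamma$ at which it stays below the significance level is the usual summary of how much hidden bias a finding can tolerate, and the design sensitivity is the large-sample limit of that threshold.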

Paper Structure (18 sections, 18 theorems, 88 equations, 3 figures, 8 tables)

Introduction
Background: Matched Observational Studies and Measuring Robustness Under Unmeasured Confounding With Test Consistency and Efficiency
Motivation: Sensitivity Analysis with Non-Binary Treatments and Failure of Ad-Hoc Procedures Based on Dichotomization
Our Contributions: A Unified Sensitivity Analysis for Binary and Non-Binary Treatment
Review
Matched Observational Studies
Rosenbaum Sensitivity Model
Power of a Sensitivity Analysis, Design Sensitivity and Bahadur-Rosenbaum Efficiency Under Binary Treatments
A Universal Framework for Demystifying Robustness of Matched Observational Studies under General Treatments
A Negative Result: Inconsistency of Sensitivity Bounds Based on Dichotomization
Generalized Design Sensitivity: Concept and Formulas
Generalized Bahadur-Rosenbaum Efficiency
Simulation Studies: Robust Designs for Observational Studies With Continuous Treatments
An Adaptive Approach for Combining Test Statistics under General Treatment Types
Methodology and Properties
...and 3 more sections

Key Result

Theorem 3.1

(Inapplicability of the Dichotomization Strategy for Continuous Treatments) When the treatment is continuous and is confounded by some unmeasured confounder $u$, the canonical Rosenbaum sensitivity bounds (i.e., when all $\Gamma_{i}$ in (eqn: Rosenbaum bounds) collapse to the same value $\Gamma$), w

Figures (3)

Figure 1: An illustration of the generalized design sensitivity and the generalized Bahadur-Rosenbaum relative efficiency in sensitivity analysis. Panel (a) illustrates the generalized design sensitivities of two tests, where test statistic $T_2$ has a higher generalized design sensitivity than test statistic $T_1$ (i.e., $\overline{\Gamma}_{*,1} < \overline{\Gamma}_{*,2}$). Panel (b) illustrates the generalized Bahadur-Rosenbaum relative efficiency $\Upsilon_2/\Upsilon_1$ for a fixed sensitivity parameter $\overline{\Gamma}$, where $T_2$ is more efficient than $T_1$; the relative efficiency captures the ratio of the minimal numbers of matched pairs that $T_1$ and $T_2$ require to achieve a specific power $\beta$ at a given significance level $\alpha$.
Figure 2: The dose-response curves and the corresponding generalized design sensitivities across the four competing test statistics.
Figure 3: Dose-response of ETS exposure and pulmonary function. Y-axis: FEV1/FVC ratio. X-axis: serum cotinine after a Box-Cox transform ($\lambda \approx -0.02$; original units ng/mL). The blue line is the ordinary least squares (OLS) fit of FEV1/FVC on transformed cotinine; its negative slope indicates that higher ETS exposure is associated with lower FEV1/FVC.
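The preprocessing described in the Figure 3 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it applies the standard Box-Cox transform $(x^{\lambda}-1)/\lambda$ (with $\log x$ at $\lambda = 0$) and fits a simple OLS line with the closed-form slope. The function names and the toy numbers are hypothetical; only $\lambda \approx -0.02$ comes from the caption.

```python
import math
from typing import Sequence, Tuple


def box_cox(x: float, lam: float) -> float:
    """Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x**lam - 1.0) / lam


def ols_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    # Hypothetical cotinine levels (ng/mL) and FEV1/FVC ratios, for illustration only.
    cotinine = [0.05, 0.2, 1.0, 5.0, 20.0]
    ratio = [0.84, 0.83, 0.81, 0.80, 0.78]
    transformed = [box_cox(c, -0.02) for c in cotinine]
    intercept, slope = ols_line(transformed, ratio)
    print(intercept, slope)
```

With $\lambda$ this close to zero the transform is nearly a natural logarithm, so the fitted slope can be read roughly as the change in FEV1/FVC per unit increase in log cotinine.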

Theorems & Definitions (37)

Theorem 3.1
Proposition 3.2: Sufficiency of the Generalized Sensitivity Parameter $\overline{\Gamma}$
Definition 3.3: Generalized Design Sensitivity
Theorem 3.4: Generalized Design Sensitivity Formula Under Continuous Treatments
Theorem 3.5: Generalized Bahadur-Rosenbaum Exact Slope
Corollary 3.6: Generalized Bahadur-Rosenbaum Efficiency
Theorem 4.1: Type I Error Rate of Adaptive Testing Procedure
Theorem 4.2: Adaptivity of the Adaptive Testing Procedure
...and 27 more
