Budgeted Active Experimentation for Treatment Effect Estimation from Observational and Randomized Data

Jiacan Gao, Xinyan Su, Mingyuan Ma, Yiyan Huang, Xiao Xu, Xinrui Wan, Tianqi Gu, Enyun Yu, Jiecheng Guo, Zhiheng Zhang

TL;DR

This work proposes a budgeted active experimentation framework that iteratively enhances model training for causal effect estimation via active sampling, and develops an acquisition function targeting uplift estimation uncertainty, overlap deficits, and domain discrepancy to select the most informative units for randomized experiments.

Abstract

Estimating heterogeneous treatment effects is central to data-driven decision-making, yet industrial applications often face a fundamental tension between limited randomized controlled trial (RCT) budgets and abundant but biased observational data collected under historical targeting policies. Although observational logs offer the advantage of scale, they inherently suffer from severe policy-induced imbalance and overlap violations, rendering standalone estimation unreliable. We propose a budgeted active experimentation framework that iteratively enhances model training for causal effect estimation via active sampling. By leveraging observational priors, we develop an acquisition function targeting uplift estimation uncertainty, overlap deficits, and domain discrepancy to select the most informative units for randomized experiments. We establish finite-sample deviation bounds, asymptotic normality via martingale Central Limit Theorems (CLTs), and minimax lower bounds to prove information-theoretic optimality. Extensive experiments on industrial datasets demonstrate that our approach significantly outperforms standard randomized baselines in cost-constrained settings.

Paper Structure (43 sections, 17 theorems, 106 equations, 2 figures, 5 tables, 2 algorithms)

Key Result

Lemma 4.4

Under Assumptions assump:predictable--assump:rct_bound, $|\widetilde{Y}_t| \;\le\; \max\!\left\{\frac{1}{f_{\min}},\,\frac{1}{1-f_{\max}}\right\} \;\triangleq\; L_p.$
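
The summary does not reproduce the definition of $\widetilde{Y}_t$ or the referenced assumptions, so the following is only one reading that is consistent with the stated constant $L_p$. It assumes an inverse-propensity-weighted pseudo-outcome, outcomes scaled so that $|Y_t| \le 1$, and a randomization probability $f_t \in [f_{\min}, f_{\max}]$ with $0 < f_{\min} \le f_{\max} < 1$. The symbols $A_t$, $Y_t$, and $f_t$ are assumptions introduced here for illustration; only $f_{\min}$, $f_{\max}$, and $L_p$ appear in the statement above.

```latex
% Assumed form (not taken from the source): IPW-style pseudo-outcome,
% with |Y_t| <= 1 and f_t in [f_min, f_max].
\[
\widetilde{Y}_t \;=\; \left(\frac{A_t}{f_t} - \frac{1-A_t}{1-f_t}\right) Y_t,
\qquad A_t \in \{0,1\}.
\]
% Case split on the assigned arm; exactly one term is nonzero.
\[
\begin{aligned}
A_t = 1:&\quad |\widetilde{Y}_t| = \frac{|Y_t|}{f_t} \;\le\; \frac{1}{f_{\min}}, \\
A_t = 0:&\quad |\widetilde{Y}_t| = \frac{|Y_t|}{1-f_t} \;\le\; \frac{1}{1-f_{\max}}.
\end{aligned}
\]
% Taking the worse of the two cases gives the constant L_p.
\[
|\widetilde{Y}_t| \;\le\; \max\!\left\{\frac{1}{f_{\min}},\,\frac{1}{1-f_{\max}}\right\} = L_p.
\]
```

Under this reading, the bound degrades as the randomization probability approaches 0 or 1, which is why a bounded-randomization assumption would be needed.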

Figures (2)

  • Figure 1: Overview of the proposed Budgeted Active Experimentation framework for OBS-RCT fusion. We first train a CATE learner using abundant observational logs, then iteratively select a small batch of units from an unlabeled candidate pool $\mathcal{D}_{\mathrm{pool}}$ for randomized experiments under a fixed budget. At each iteration, a multi-criteria acquisition function $S(u)$ scores and ranks candidates by balancing three signals: epistemic uncertainty ($v_u$), domain discrepancy ($d_u$), and overlap deficit ($o_u$). We select the top-ranked units to run RCTs, add the new outcomes to the labeled set, and update the learner to reduce CATE estimation error.
  • Figure 2: Performance comparison of Active Learning (AL) versus Random Sampling (Rand) strategies across varying RCT sample sizes (10k to 500k). The curves display the AUUC scores for DRCFR, DESCN, and DragonNet models. The experiments are conducted on the Biased Observational Set $\mathcal{D}_{\text{obs}}^{\text{bias}}$, demonstrating the superior data efficiency of active sampling (solid lines) compared to random sampling (dashed lines) under distribution shifts.
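
The scoring-and-ranking step described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary does not state how $S(u)$ combines $v_u$, $d_u$, and $o_u$, so the min-max normalization, the weighted sum, the equal default weights, and the function names `min_max`, `acquisition_scores`, and `select_batch` are all assumptions made here.

```python
from typing import List, Sequence, Tuple


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant signal maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


def acquisition_scores(
    v: Sequence[float],
    d: Sequence[float],
    o: Sequence[float],
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> List[float]:
    """Hypothetical S(u): weighted sum of normalized v_u, d_u, o_u.

    The combination rule is an assumption; the source only says that
    S(u) balances the three signals.
    """
    w_v, w_d, w_o = weights
    v_n, d_n, o_n = min_max(v), min_max(d), min_max(o)
    return [w_v * a + w_d * b + w_o * c for a, b, c in zip(v_n, d_n, o_n)]


def select_batch(scores: Sequence[float], batch_size: int) -> List[int]:
    """Indices of the top-scoring candidates, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:batch_size]


# Toy pool of four candidates with made-up signal values.
v = [0.1, 0.9, 0.5, 0.2]  # epistemic uncertainty v_u
d = [0.2, 0.8, 0.1, 0.3]  # domain discrepancy d_u
o = [0.0, 1.0, 0.5, 0.1]  # overlap deficit o_u

scores = acquisition_scores(v, d, o)
chosen = select_batch(scores, batch_size=2)
print(chosen)
```

In an iterative loop, the chosen units would be sent to a randomized experiment, their outcomes added to the labeled set, and the learner refit before the next round of scoring, until the budget is spent.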

Theorems & Definitions (28)

  • Lemma 4.4: Unbiased pseudo-outcome under adaptive sampling
  • Theorem 4.5: Finite-sample unbiasedness and deviation bound
  • Corollary 4.6: Finite-sample PEHE bound
  • Theorem 4.8: Asymptotic normality (martingale CLT)
  • Theorem 4.9: Minimax lower bound for active RCT under bounded randomization
  • Corollary 4.10: Near-minimax optimality of the orthogonalized estimator
  • Lemma 2.4: Unbiased pseudo-outcome under adaptive sampling
  • proof
  • Lemma 2.5: Conditional sub-Gaussianity of $\varepsilon_t$
  • proof
  • ...and 18 more