Budgeted Active Experimentation for Treatment Effect Estimation from Observational and Randomized Data

Jiacan Gao; Xinyan Su; Mingyuan Ma; Yiyan Huang; Xiao Xu; Xinrui Wan; Tianqi Gu; Enyun Yu; Jiecheng Guo; Zhiheng Zhang

TL;DR

This work proposes a budgeted active experimentation framework that iteratively enhances model training for causal effect estimation via active sampling, and develops an acquisition function targeting uplift estimation uncertainty, overlap deficits, and domain discrepancy to select the most informative units for randomized experiments.

Abstract

Estimating heterogeneous treatment effects is central to data-driven decision-making, yet industrial applications often face a fundamental tension between limited randomized controlled trial (RCT) budgets and abundant but biased observational data collected under historical targeting policies. Although observational logs offer the advantage of scale, they inherently suffer from severe policy-induced imbalance and overlap violations, rendering standalone estimation unreliable. We propose a budgeted active experimentation framework that iteratively enhances model training for causal effect estimation via active sampling. By leveraging observational priors, we develop an acquisition function targeting uplift estimation uncertainty, overlap deficits, and domain discrepancy to select the most informative units for randomized experiments. We establish finite-sample deviation bounds, asymptotic normality via martingale Central Limit Theorems (CLTs), and minimax lower bounds to prove information-theoretic optimality. Extensive experiments on industrial datasets demonstrate that our approach significantly outperforms standard randomized baselines in cost-constrained settings.

Paper Structure (43 sections, 17 theorems, 106 equations, 2 figures, 5 tables, 2 algorithms)

Introduction
Problem Formulation
Methodology
Adaptive RCT execution and dataset update
Theoretical Analysis
Role of observational data in the estimator.
Unbiasedness and finite-sample error bounds
Asymptotic normality under adaptive sampling
Minimax lower bound and (near) optimality
Experiments
Conclusion
Related Work
Theoretical Analysis
Setup: adaptive RCT stream and pseudo-outcome regression
Unbiasedness and finite-sample error bounds
...and 28 more sections

Key Result

Lemma 4.4

Under Assumptions assump:predictable through assump:rct_bound, $|\widetilde{Y}_t| \le \max\left\{\frac{1}{f_{\min}}, \frac{1}{1-f_{\max}}\right\} \triangleq L_p.$
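
The bound can be checked numerically. This summary does not define $\widetilde{Y}_t$, so the sketch below assumes an inverse-propensity pseudo-outcome $\widetilde{Y}_t = \left(\frac{T_t}{e_t} - \frac{1-T_t}{1-e_t}\right) Y_t$, with outcomes satisfying $|Y_t| \le 1$ and randomization probabilities $e_t \in [f_{\min}, f_{\max}]$. The function name, the simulated data, and the constants are illustrative and do not come from the paper.

```python
import numpy as np

def pseudo_outcome(y, t, e):
    """Inverse-propensity pseudo-outcome (assumed form, not taken from the paper)."""
    return (t / e - (1 - t) / (1 - e)) * y

rng = np.random.default_rng(0)
f_min, f_max = 0.2, 0.7          # assumed bounds on the randomization probability
n = 200_000

e = rng.uniform(f_min, f_max, size=n)   # per-unit treatment probability
t = rng.binomial(1, e)                  # randomized assignment
# Binary outcomes, so |Y| <= 1; the true effect is 0.5 - 0.3 = 0.2 (simulated).
y = rng.binomial(1, 0.3 + 0.2 * t)

y_tilde = pseudo_outcome(y, t, e)
L_p = max(1.0 / f_min, 1.0 / (1.0 - f_max))

print("max |pseudo-outcome|:", np.abs(y_tilde).max(), "bound L_p:", L_p)
print("mean pseudo-outcome:", y_tilde.mean())
```

Under these assumptions every pseudo-outcome stays within $L_p$, and the sample mean sits close to the simulated effect of 0.2, which is the kind of unbiasedness the lemma list attributes to the pseudo-outcome.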

Figures (2)

Figure 1: Overview of the proposed Budgeted Active Experimentation framework for OBS-RCT fusion. We first train a CATE learner using abundant observational logs, then iteratively select a small batch of units from an unlabeled candidate pool $\mathcal{D}_{\mathrm{pool}}$ for randomized experiments under a fixed budget. At each iteration, a multi-criteria acquisition function $S(u)$ scores and ranks candidates by balancing three signals: epistemic uncertainty ($v_u$), domain discrepancy ($d_u$), and overlap deficit ($o_u$). We select the top-ranked units to run RCTs, add the new outcomes to the labeled set, and update the learner to reduce CATE estimation error.
Figure 2: Performance comparison of Active Learning (AL) versus Random Sampling (Rand) strategies across varying RCT sample sizes (10k to 500k). The curves display the AUUC scores for DRCFR, DESCN, and DragonNet models. The experiments are conducted on the Biased Observational Set $\mathcal{D}_{\text{obs}}^{\text{bias}}$, demonstrating the superior data efficiency of active sampling (solid lines) compared to random sampling (dashed lines) under distribution shifts.
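
The selection step described in the Figure 1 caption can be sketched as follows. The caption states only that $S(u)$ balances epistemic uncertainty ($v_u$), domain discrepancy ($d_u$), and overlap deficit ($o_u$). The weighted sum of min-max-normalized signals, the equal default weights, and all function names below are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def minmax(x):
    """Rescale a signal to [0, 1]; a constant signal maps to all zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def acquisition_scores(v, d, o, weights=(1.0, 1.0, 1.0)):
    """Hypothetical S(u): weighted sum of the three normalized signals."""
    w_v, w_d, w_o = weights
    return w_v * minmax(v) + w_d * minmax(d) + w_o * minmax(o)

def select_batch(scores, batch_size):
    """Indices of the top-scoring candidates, highest score first."""
    order = np.argsort(-scores, kind="stable")
    return order[:batch_size]

# Toy candidate pool of four units (made-up signal values).
v = np.array([0.10, 0.90, 0.40, 0.20])   # epistemic uncertainty
d = np.array([0.50, 0.30, 0.80, 0.10])   # domain discrepancy
o = np.array([0.00, 0.60, 0.90, 0.20])   # overlap deficit

scores = acquisition_scores(v, d, o)
chosen = select_batch(scores, batch_size=2)
print(chosen)  # → [2 1]
```

In an iterative loop, the chosen units would be sent to the randomized experiment, their outcomes appended to the labeled set, and the signals recomputed from the updated learner before the next batch is scored.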

Theorems & Definitions (28)

Lemma 4.4: Unbiased pseudo-outcome under adaptive sampling
Theorem 4.5: Finite-sample unbiasedness and deviation bound
Corollary 4.6: Finite-sample PEHE bound
Theorem 4.8: Asymptotic normality (martingale CLT)
Theorem 4.9: Minimax lower bound for active RCT under bounded randomization
Corollary 4.10: Near-minimax optimality of the orthogonalized estimator
Lemma 2.4: Unbiased pseudo-outcome under adaptive sampling
proof
Lemma 2.5: Conditional sub-Gaussianity of $\varepsilon_t$
proof
...and 18 more
