Kernel Treatment Effects with Adaptively Collected Data

Houssam Zenati; Bariscan Bozkurt; Arthur Gretton

TL;DR

This work develops the first kernel-based framework for distributional causal inference under adaptive data collection. It introduces kernel treatment effects (KTE) defined via counterfactual mean embeddings in an RKHS and derives a variance-stabilized, doubly robust estimator that admits a Hilbert-space martingale CLT under adaptivity. A sample-split stabilized test is proposed to yield valid Gaussian limits for testing equality of counterfactual distributions, with a practical, fold-based procedure for variance estimation and cross-fitting to handle nuisance components. Through synthetic, IHDP, and dSprite experiments, the method achieves calibrated type-I error and high power for both mean shifts and higher-moment distributional changes, outperforming mean-focused adaptive baselines. The results suggest a general strategy for distributional inference under adaptivity using variance-stabilized martingale arguments for RKHS-valued estimands.

Abstract

Adaptive experiments improve efficiency by adjusting treatment assignments based on past outcomes, but this adaptivity breaks the i.i.d. assumptions that underpin classical asymptotics. At the same time, many questions of interest are distributional, extending beyond average effects. Kernel treatment effects (KTE) provide a flexible framework by representing counterfactual outcome distributions in an RKHS and comparing them via kernel distances. We present the first kernel-based framework for distributional inference under adaptive data collection. Our method combines doubly robust scores with variance stabilization to ensure asymptotic normality via a Hilbert-space martingale CLT, and introduces a sample-fitted stabilized test with valid type-I error. Experiments show it is well calibrated and effective for both mean shifts and higher-moment differences, outperforming adaptive baselines limited to scalar effects.
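To illustrate what "comparing distributions via kernel distances" means, the following is a minimal sketch of a plain squared maximum mean discrepancy (MMD) between two scalar outcome samples. It is not the paper's estimator: it assumes i.i.d. samples and ignores covariates, propensities, doubly robust scores, and adaptivity. The function names, the Gaussian kernel, the bandwidth, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars; bandwidth is an assumed choice."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    Equals the squared RKHS distance between the empirical mean embeddings
    of xs and ys, so it is non-negative up to floating-point error.
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Simulated outcomes (illustrative only): all three samples have mean zero,
# so a difference-in-means comparison has nothing to detect.
rng = random.Random(0)
control = [rng.gauss(0.0, 1.0) for _ in range(300)]
same = [rng.gauss(0.0, 1.0) for _ in range(300)]    # same distribution as control
wider = [rng.gauss(0.0, 2.0) for _ in range(300)]   # same mean, larger variance

mmd_same = mmd_squared(control, same)
mmd_wider = mmd_squared(control, wider)
print(f"MMD^2 (same distribution): {mmd_same:.4f}")
print(f"MMD^2 (variance shift):    {mmd_wider:.4f}")
```

In this sketch the variance-shifted sample yields a clearly larger kernel distance even though the means coincide, which is the kind of higher-moment difference a scalar average-effect comparison would miss. Turning such a distance into a valid test under adaptive assignment is what the stabilized, doubly robust construction described in the abstract addresses.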
