Functional Sequential Treatment Allocation with Covariates

Anders Bredahl Kock; David Preinerstorfer; Bezirgen Veliyev

TL;DR

This work addresses sequential treatment allocation with covariates when the objective is a general functional $\mathsf{T}$ of the conditional outcome distribution rather than the mean. It introduces the Functional Upper Confidence Bound (F-UCB) policy with covariates, implemented via covariate-space binning: within each bin, the policy estimates the conditional outcome distributions $F^i(\cdot, x)$ and selects the arm that maximizes an upper confidence bound on $\mathsf{T}(F^i(\cdot, x))$. Under Hölder equicontinuity of the conditional distributions and a margin condition, the authors prove sublinear, near-minimax regret bounds and show that ignoring covariates leads to linear regret. The results extend prior functional-target bandit theory to covariate settings, providing adaptivity to arm similarity and ethical guarantees on exploration, with lower bounds matching the upper bounds up to logarithmic factors.

Abstract

We consider a multi-armed bandit problem with covariates. Given a realization of the covariate vector, instead of targeting the treatment with highest conditional expectation, the decision maker targets the treatment which maximizes a general functional of the conditional potential outcome distribution, e.g., a conditional quantile, trimmed mean, or a socio-economic functional such as an inequality, welfare or poverty measure. We develop expected regret lower bounds for this problem, and construct a near minimax optimal assignment policy.
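
The binning idea behind the policy can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: a scalar covariate in $[0, 1]$ split into equal-width bins, the empirical median standing in for the functional $\mathsf{T}$, a generic exploration bonus of the form $\sqrt{2 \log t / n}$, and the hypothetical names `BinnedFUCB` and `empirical_quantile`. The paper's actual confidence bounds and bin-width choice are not reproduced here.

```python
import math
from typing import Callable, List


def empirical_quantile(samples: List[float], q: float = 0.5) -> float:
    """Illustrative functional T: the q-quantile of the empirical distribution."""
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[idx]


class BinnedFUCB:
    """Hypothetical sketch: one UCB-style index per (covariate bin, arm) pair."""

    def __init__(
        self,
        n_arms: int,
        n_bins: int,
        functional: Callable[[List[float]], float] = empirical_quantile,
        bonus_scale: float = 1.0,  # assumed tuning constant, not from the paper
    ) -> None:
        self.n_arms = n_arms
        self.n_bins = n_bins
        self.functional = functional
        self.bonus_scale = bonus_scale
        # samples[b][i] holds the outcomes observed for arm i in bin b.
        self.samples: List[List[List[float]]] = [
            [[] for _ in range(n_arms)] for _ in range(n_bins)
        ]
        self.rounds = [0] * n_bins  # number of observations per bin

    def bin_of(self, x: float) -> int:
        """Map a covariate in [0, 1] to an equal-width bin index."""
        return max(0, min(self.n_bins - 1, int(x * self.n_bins)))

    def select(self, x: float) -> int:
        """Pick the arm with the largest optimistic estimate of T in x's bin."""
        b = self.bin_of(x)
        # Try every arm once in this bin before using the index.
        for arm in range(self.n_arms):
            if not self.samples[b][arm]:
                return arm
        t = self.rounds[b] + 1

        def index(arm: int) -> float:
            obs = self.samples[b][arm]
            bonus = self.bonus_scale * math.sqrt(2.0 * math.log(t) / len(obs))
            return self.functional(obs) + bonus

        return max(range(self.n_arms), key=index)

    def update(self, x: float, arm: int, outcome: float) -> None:
        """Record the observed outcome for the chosen arm in x's bin."""
        b = self.bin_of(x)
        self.samples[b][arm].append(outcome)
        self.rounds[b] += 1
```

In use, each round calls `select(x)` on the incoming covariate, assigns the returned treatment, and passes the realized outcome to `update`. Because the statistics are kept separately per bin, the preferred arm can differ across regions of the covariate space, which a covariate-blind policy cannot represent.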
