Active clustering with bandit feedback
Victor Thuot, Alexandra Carpentier, Christophe Giraud, Nicolas Verzelen
TL;DR
This work tackles the Active Clustering Problem (ACP), in which N arms with d-dimensional mean vectors and subGaussian feedback are partitioned into K hidden groups, and the goal is δ-PAC exact recovery of the partition with a minimal budget τ. The authors establish a non-asymptotic lower bound on the budget that separates a dimension-free term from a high-dimensional term, and introduce ACB, a computationally efficient algorithm whose budget matches the lower bound in many high-dimensional regimes. ACB decomposes the task into Sequential Representatives Identification (SRI) and Active Distance-based Classification (ADC); because it attains near-optimal budgets in polynomial time, there is no computation-information gap in the active setting. An adaptive variant, ACB^*, handles unknown Δ_* and θ_* via a multiscale search, and numerical experiments on high-dimensional synthetic data validate the theoretical gains and δ-PAC guarantees. Overall, the paper advances the understanding of efficient active clustering in high dimensions and demonstrates practical gains over batch or uniform sampling approaches.
Abstract
We investigate the Active Clustering Problem (ACP). A learner interacts with an $N$-armed stochastic bandit with $d$-dimensional subGaussian feedback. There exists a hidden partition of the arms into $K$ groups, such that arms within the same group share the same mean vector. The learner's task is to uncover this hidden partition with the smallest budget (i.e., the smallest number of observations) and with a probability of error smaller than a prescribed constant $δ$. In this paper, (i) we derive a non-asymptotic lower bound for the budget, and (ii) we introduce the computationally efficient ACB algorithm, whose budget matches the lower bound in most regimes. In particular, ACB improves on the performance of a uniform sampling strategy. Importantly, contrary to the batch setting, we establish that there is no computation-information gap in the active setting.
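
To make the setting concrete, the following is a minimal simulation sketch of the bandit environment described above, together with the uniform sampling baseline against which ACB is compared. It is not the ACB algorithm. The names `ClusteredBandit`, `uniform_sample_means`, and `same_partition` are hypothetical, and the noise is assumed to be isotropic Gaussian with scale `sigma`, which is one particular subGaussian choice.

```python
import numpy as np


class ClusteredBandit:
    """Hypothetical simulator: N arms, K hidden groups, d-dimensional feedback."""

    def __init__(self, labels, centers, sigma=1.0, seed=0):
        self.labels = np.asarray(labels)                 # hidden group of each arm, shape (N,)
        self.centers = np.asarray(centers, dtype=float)  # group mean vectors, shape (K, d)
        self.sigma = sigma                               # assumed Gaussian noise scale
        self.rng = np.random.default_rng(seed)
        self.budget_used = 0                             # total number of observations so far

    def pull(self, arm):
        """Return one noisy d-dimensional observation of the chosen arm."""
        self.budget_used += 1
        mean = self.centers[self.labels[arm]]
        return mean + self.sigma * self.rng.standard_normal(mean.shape)


def uniform_sample_means(env, n_arms, pulls_per_arm):
    """Uniform baseline: pull every arm equally often and average the feedback."""
    return np.stack([
        np.mean([env.pull(arm) for _ in range(pulls_per_arm)], axis=0)
        for arm in range(n_arms)
    ])


def same_partition(a, b):
    """True if two label vectors induce the same partition, up to relabeling."""
    a, b = np.asarray(a), np.asarray(b)
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])
```

In this sketch, `budget_used` plays the role of the budget, and `same_partition` expresses exact recovery: a δ-PAC procedure must output labels for which it returns `True` with probability at least $1-δ$.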
