Distribution-Dependent Rates for Multi-Distribution Learning

Rafael Hanashiro; Patrick Jaillet

TL;DR

This paper addresses multi-distribution learning (MDL) under distributional uncertainty by introducing distribution-dependent, gap-aware rates inspired by pure-exploration bandits. It analyzes two non-adaptive strategies, Uniform Exploration and Non-Uniform Exploration (NUE), and an adaptive optimistic method (LCB-DR), deriving non-asymptotic, gap-dependent bounds that decay exponentially with the budget and can outperform distribution-independent rates. The work introduces complexity measures H_a and C_a to capture the difficulty of identifying worst-case distributions, and extends the analysis to infinite decision sets via covering arguments. Empirically, it shows that NUE leverages variance across environments and that LCB-DR can significantly improve simple regret and error probabilities, highlighting practical gains in robust learning under distribution shifts.

Abstract

To address the needs of modeling uncertainty in sensitive machine learning applications, the setup of distributionally robust optimization (DRO) seeks good performance uniformly across a variety of tasks. The recent multi-distribution learning (MDL) framework tackles this objective in a dynamic interaction with the environment, where the learner has sampling access to each target distribution. Drawing inspiration from the field of pure-exploration multi-armed bandits, we provide distribution-dependent guarantees in the MDL regime that scale with suboptimality gaps and result in superior dependence on the sample size when compared to the existing distribution-independent analyses. We investigate two non-adaptive strategies, uniform and non-uniform exploration, and present non-asymptotic regret bounds using novel tools from empirical process theory. Furthermore, we devise an adaptive optimistic algorithm, LCB-DR, that showcases enhanced dependence on the gaps, mirroring the contrast between uniform and optimistic allocation in the multi-armed bandit literature. We also conduct a small synthetic experiment illustrating the comparative strengths of each strategy.
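
As a rough illustration of the uniform exploration idea mentioned in the abstract, the sketch below splits a sampling budget evenly across the target distributions and returns the decision with the smallest worst-case empirical risk. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `uniform_exploration`, the finite decision grid, the squared loss, and the two Gaussian environments are all hypothetical choices made for the example.

```python
import random


def uniform_exploration(samplers, decisions, loss, budget, seed=0):
    """Split the budget evenly, then minimize the worst-case empirical risk.

    All names here are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    per_dist = budget // len(samplers)
    # Draw the same number of samples from every distribution.
    data = [[draw(rng) for _ in range(per_dist)] for draw in samplers]

    def worst_case_risk(h):
        # Empirical risk of h on each distribution; keep the largest one.
        return max(sum(loss(h, z) for z in zs) / len(zs) for zs in data)

    return min(decisions, key=worst_case_risk)


# Toy instance (hypothetical): two Gaussian environments with means 0 and 1.
samplers = [
    lambda rng: rng.gauss(0.0, 0.1),
    lambda rng: rng.gauss(1.0, 0.1),
]
decisions = [0.0, 0.25, 0.5, 0.75, 1.0]


def squared_loss(h, z):
    return (h - z) ** 2


best = uniform_exploration(samplers, decisions, squared_loss, budget=2000)
print(best)
```

In this toy instance the decision that balances the two environments is the midpoint of the grid, since moving toward either mean raises the risk on the other distribution. An adaptive method would instead concentrate samples on the distributions that appear hardest, which is the contrast the abstract draws between uniform and optimistic allocation.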
