Blessings of Multiple Good Arms in Multi-Objective Linear Bandits

Heesang Ann; Min-hwan Oh

TL;DR

This study is the first to introduce implicit exploration in both multi-objective and parametric bandit settings without any distributional assumptions on the contexts. It also introduces a framework for effective Pareto fairness, a principled approach to rigorously analyzing the fairness of multi-objective bandit algorithms.

Abstract

The multi-objective bandit setting has traditionally been regarded as more complex than the single-objective case, because multiple objectives must be optimized simultaneously. In contrast to this prevailing view, we demonstrate that when multiple good arms exist for multiple objectives, they can induce a surprising benefit: implicit exploration. Under this condition, we show that simple algorithms that greedily select actions in most rounds can nonetheless achieve strong performance, both theoretically and empirically. To our knowledge, this is the first study to introduce implicit exploration in both multi-objective and parametric bandit settings without any distributional assumptions on the contexts. We further introduce a framework for effective Pareto fairness, which provides a principled approach to rigorously analyzing the fairness of multi-objective bandit algorithms.

Paper Structure (76 sections, 36 theorems, 97 equations, 12 figures, 1 table, 4 algorithms)

Introduction
Contributions
Related work
Multi-objective bandit.
Free exploration.
Problem settings
Notations
Multi-objective linear bandits
Effective Pareto regret
Effective Pareto fairness
Beyond effective Pareto regret.
Existing fairness criterion Drugan2013.
Effective Pareto fairness.
Pareto front approximation.
Proposed algorithm
...and 61 more sections

Key Result

Proposition 1

For any $a_*\in\mathcal{C}^*$, there exists $w \in \mathbb{S}^{M-1}$ such that $a_* = \arg\max_{i\in[K]} w^\top\mu_i$. Conversely, for any $w \in \mathbb{S}^{M-1}$, if the maximizer $a_* = \arg\max_{i\in[K]} w^\top\mu_i$ is unique, then $a_*\in\mathcal{C}^*$.
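
The converse direction of Proposition 1 can be illustrated with a small numerical sketch. The mean vectors in `MU`, the helper names, and the weight sampler are all hypothetical and are not taken from the paper. The sketch also assumes that $\mathbb{S}^{M-1}$ consists of nonnegative weight vectors (normalized here to sum to one), and it checks only plain Pareto dominance, because the definition of the effective Pareto front $\mathcal{C}^*$ is not reproduced in this summary.

```python
import random

# Hypothetical mean reward vectors mu_i for K = 5 arms and M = 2 objectives.
MU = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.4, 0.4), (0.3, 0.2)]

def scalarize(w, mu_i):
    """Linear scalarization w^T mu_i."""
    return sum(wm * x for wm, x in zip(w, mu_i))

def best_arms(w, mu):
    """Indices of all arms attaining max_i w^T mu_i (more than one on ties)."""
    scores = [scalarize(w, m) for m in mu]
    top = max(scores)
    return [i for i, s in enumerate(scores) if s == top]

def is_dominated(i, mu):
    """True if another arm is at least as good on every objective
    and strictly better on at least one (plain Pareto dominance)."""
    return any(
        all(b >= a for a, b in zip(mu[i], mu[j]))
        and any(b > a for a, b in zip(mu[i], mu[j]))
        for j in range(len(mu)) if j != i
    )

def arms_found_by_random_weights(mu, n_weights=1000, seed=0):
    """Collect the arms that are the unique maximizer for some sampled weight.

    Assumption: weights are nonnegative and sum to one, drawn by
    normalizing independent exponential variables.
    """
    rng = random.Random(seed)
    found = set()
    for _ in range(n_weights):
        raw = [rng.expovariate(1.0) for _ in mu[0]]
        total = sum(raw)
        w = [r / total for r in raw]
        winners = best_arms(w, mu)
        if len(winners) == 1:  # the proposition requires a unique maximizer
            found.add(winners[0])
    return found

found = arms_found_by_random_weights(MU)
```

Under these assumptions, every arm collected in `found` is undominated, while the dominated arms at indices 3 and 4 are never the unique maximizer for any sampled weight.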

Figures (12)

Figure 1: Evaluation of multi-objective bandit algorithms in the fixed-feature setting. The plots in the left two columns report the performance of the algorithms, while the plots in the rightmost column report the running time. The top row shows results for $d=5$, $K=50$, $M=5$, and the bottom row shows results for $d=10$, $K=200$, $M=10$.
Figure 4.1: The larger circle represents the unit sphere in $\mathbb{R}^d$, while the interior of the smaller circle indicates the region where $\widetilde{\theta}(s)$ may lie. The blue line illustrates the distance between $\theta_{m}^*$ and the $\gamma$-good arm for $\widetilde{\theta}(s)$.
Figure 7.1: The interior of the circle with radius $\frac{x_{\max}}{\gamma}$ represents the region in $\mathbb{R}^d$ where $\frac{x}{\gamma}$ may lie, while the interior of the smallest circle indicates the region where $\widetilde{\theta}(s)$ may lie. The blue line illustrates the case in which $\frac{x}{\gamma}$ is farthest from $\theta_{m}^*$.
Figure 7.2: The larger circle represents the unit sphere in $\mathbb{R}^d$, while the interior of the smallest circle indicates the region where $\widetilde{\theta}(s)$ may lie. The blue line illustrates the case in which $x$ is farthest from $\frac{\theta_{m}^*}{\|\theta_{m}^*\|_2}$.
Figure 9.1: Problem space $\Theta$ construction when $d=2$.
...and 7 more figures

Theorems & Definitions (57)

Definition 1: Pareto order
Definition 2: Effective Pareto front
Proposition 1: Theorem 1 of park2025thompson
Definition 3: Effective Pareto regret
Definition 4: Effective Pareto fairness
Remark 1
Definition 5: Goodness of arms
Remark 2
Remark 3
Definition 6: Regularity indices of a distribution
...and 47 more
