The Traveling Bandit: A Framework for Bayesian Optimization with Movement Costs
Qiyuan Chen, Raed Al Kontar
TL;DR
This work addresses Bayesian Optimization (BO) in settings where changing inputs incurs movement costs. It introduces a plug-in framework that first selects a batch of candidate designs via a standard BO acquisition function and then visits them along a shortest-path tour obtained by solving a Traveling Salesman Problem (TSP). The key idea is to decouple function optimization from movement planning, so that the cumulative movement cost $C_T$ grows sublinearly while the regret $R_T$ remains sublinear, where the overall loss is $L_T=R_T+C_T$. The authors establish a general path-length bound on tours in metric spaces using the Minkowski dimension, and show that batched algorithms with $N(T)=o(T^{1/d})$ batches achieve convergence in both movement costs and regret, with concrete results for BO and other stochastic bandits (multi-armed bandits (MAB) and Lipschitz bandits). Empirically, the method reduces movement costs over time across multiple test functions without sacrificing regret, and is shown to be compatible with various acquisition functions and batched strategies. The framework applies broadly to online decision-making problems where input changes carry non-negligible costs.
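The movement-planning step described above can be illustrated with a minimal sketch. It assumes a Euclidean metric, a small batch that has already been chosen by some acquisition function, and a brute-force search over visiting orders in place of a real TSP solver. The names `path_length`, `shortest_visit_order`, `start`, and `batch` are hypothetical and do not come from the paper.

```python
import itertools
import math


def path_length(start, order):
    """Total Euclidean distance travelled from `start` through `order`."""
    total = 0.0
    prev = start
    for point in order:
        total += math.dist(prev, point)
        prev = point
    return total


def shortest_visit_order(start, batch):
    """Return the visiting order of `batch` with the smallest path length.

    Brute force over all permutations, so only practical for small batches;
    a heuristic or exact TSP solver would replace this in practice.
    """
    best = min(
        itertools.permutations(batch),
        key=lambda order: path_length(start, order),
    )
    return list(best)


# Hypothetical current input and batch of designs (assumed to come from
# an acquisition function; the selection step is not modelled here).
start = (0.0, 0.0)
batch = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.1, 0.0)]

tour = shortest_visit_order(start, batch)
naive_cost = path_length(start, batch)  # visit in the order selected
tour_cost = path_length(start, tour)    # visit along the shortest path
```

The set of designs evaluated is unchanged; only the visiting order differs, which is why the regret analysis of the underlying batched algorithm can be kept separate from the movement cost.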
Abstract
This paper introduces a framework for Bayesian Optimization (BO) with metric movement costs, addressing a critical challenge in practical applications where altering inputs incurs varying costs. Our approach is a convenient plug-in that integrates with the existing literature on batched algorithms: the designs within each batch are observed in the order given by the solution of a Traveling Salesman Problem (TSP). The proposed method provides a theoretical guarantee of convergence in terms of movement costs for BO. Empirically, our method reduces average movement costs over time while maintaining regret comparable to that of conventional BO methods. This framework also shows promise for broader applications in various bandit settings with movement costs.
