HyperArm Bandit Optimization: A Novel approach to Hyperparameter Optimization and an Analysis of Bandit Algorithms in Stochastic and Adversarial Settings

Samih Karroum; Saad Mazhar

TL;DR

This work analyzes bandit algorithms in stochastic and adversarial settings and introduces HyperArm Bandit Optimization (HABO), a hierarchical hyperparameter-tuning framework. HABO uses EXP3 to treat hyperparameters as super-arms and their values as sub-arms, enabling dynamic resource allocation and robustness to noise and adversarial dynamics. Theoretical results establish sublinear regret bounds for EXP3 and for HABO's hierarchical setting, while experiments on Titanic and House Prices show HABO matching or outperforming Bayesian Optimization in classification and often achieving faster convergence. The findings suggest HABO as a scalable, no-regret alternative for hyperparameter optimization in noisy or evolving environments, with potential extensions to other adversarial-bandit algorithms and ML domains.

Abstract

This paper explores bandit algorithms in both stochastic and adversarial settings, covering theoretical analysis and practical applications. It first introduces bandit problems, distinguishes the stochastic and adversarial variants, and examines three key algorithms: Explore-Then-Commit (ETC), Upper Confidence Bound (UCB), and the Exponential-Weight Algorithm for Exploration and Exploitation (EXP3). Theoretical regret bounds are analyzed to compare the performance of these algorithms. The paper then introduces a novel framework, HyperArm Bandit Optimization (HABO), which applies EXP3 to hyperparameter tuning in machine learning models. Unlike traditional methods that treat entire configurations as arms, HABO treats each individual hyperparameter as a super-arm and that hyperparameter's candidate values as sub-arms, enabling dynamic resource allocation and efficient exploration. Experimental results demonstrate HABO's effectiveness in classification and regression tasks, outperforming Bayesian Optimization in terms of computational efficiency and accuracy. The paper concludes with insights into the convergence guarantees of HABO and its potential for scalable and robust hyperparameter optimization.
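
The super-arm and sub-arm structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes rewards (for example, validation accuracy) lie in [0, 1], uses the standard EXP3 update with a fixed exploration rate gamma, and feeds the same observed score to both levels. The names Exp3, habo_sketch, toy_score, and space are hypothetical, and toy_score is only a stand-in for training and validating a model.

```python
import math
import random


class Exp3:
    """Standard EXP3 over n_arms arms, assuming rewards in [0, 1]."""

    def __init__(self, n_arms, gamma=0.1, rng=None):
        self.n_arms = n_arms
        self.gamma = gamma
        self.weights = [1.0] * n_arms
        self.rng = rng if rng is not None else random.Random()

    def probabilities(self):
        # Mix the weight-proportional distribution with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in self.weights]

    def draw(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, reward):
        # Importance-weighted reward estimate for the arm that was played.
        estimate = reward / prob
        self.weights[arm] *= math.exp(self.gamma * estimate / self.n_arms)
        # Rescaling all weights leaves the probabilities unchanged
        # and keeps the numbers bounded over long runs.
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def habo_sketch(search_space, evaluate, rounds=200, gamma=0.1, seed=0):
    """Two-level EXP3: pick a hyperparameter, then pick a value for it."""
    rng = random.Random(seed)
    names = list(search_space)
    super_bandit = Exp3(len(names), gamma, rng)
    sub_bandits = {n: Exp3(len(search_space[n]), gamma, rng) for n in names}
    # Assumption: start from the first listed value of every hyperparameter.
    config = {n: values[0] for n, values in search_space.items()}
    best_config, best_score = dict(config), float("-inf")
    for _ in range(rounds):
        h, p_h = super_bandit.draw()           # super-arm: which hyperparameter
        name = names[h]
        v, p_v = sub_bandits[name].draw()      # sub-arm: which value
        config[name] = search_space[name][v]
        score = evaluate(config)               # must lie in [0, 1]
        super_bandit.update(h, p_h, score)
        sub_bandits[name].update(v, p_v, score)
        if score > best_score:
            best_config, best_score = dict(config), score
    return best_config, best_score


def toy_score(config):
    # Hypothetical objective standing in for validation accuracy.
    return 1.0 - 0.1 * abs(config["depth"] - 5) - 0.2 * abs(config["lr"] - 0.1)


space = {"depth": [1, 3, 5, 7, 9], "lr": [0.01, 0.1, 1.0]}
print(habo_sketch(space, toy_score))
```

Only one hyperparameter changes per round in this sketch, so each evaluation reflects a single coordinate move from the current configuration. Whether HABO carries the configuration forward in this way is an assumption made here for illustration.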
