From Random Search to Bandit Learning in Metric Measure Spaces

Chuying Han; Yasong Feng; Tianyu Wang

TL;DR

This paper introduces the scattering dimension $d_s$ to provide a non-heuristic theory for Random Search in hyperparameter optimization, showing that the optimality gap decays as $\widetilde{O}((1/T)^{1/d_s})$ in noise-free settings and as $\widetilde{O}((1/T)^{1/(d_s+1)})$ under bounded i.i.d. noise. It connects $d_s$ to the zooming dimension $d_z$ and demonstrates that, in metric measure spaces endowed with a probability measure, the BLiN-MOS algorithm achieves regret $\widetilde{O}(T^{d_z/(d_z+1)})$ with only $O(\log\log T)$ communication rounds. The work also clarifies the relationship between scattering and zooming dimensions, showing how landscape geometry governs both sampling efficiency and near-optimal-region structure, and emphasizes the necessity of a well-defined probability measure for scattering-dimension analysis. Overall, the results furnish the first non-heuristic justification for Random Search performance, propose a Lipschitz-bandit algorithm tailored to metric measure spaces, and quantify fundamental trade-offs between discrimination and exploration in high-dimensional landscapes.

Abstract

Random Search is one of the most widely used methods for Hyperparameter Optimization, and is critical to the success of deep learning models. Despite its astonishing performance, little non-heuristic theory has been developed to describe the underlying working mechanism. This paper gives a theoretical accounting of Random Search. We introduce the concept of \emph{scattering dimension}, which describes the landscape of the underlying function and quantifies the performance of Random Search. We show that, when the environment is noise-free, the output of Random Search converges to the optimal value in probability at rate $\widetilde{\mathcal{O}} \left( \left( \frac{1}{T} \right)^{\frac{1}{d_s}} \right)$, where $d_s \ge 0$ is the scattering dimension of the underlying function. When the observed function values are corrupted by bounded i.i.d. noise, the output of Random Search converges to the optimal value in probability at rate $\widetilde{\mathcal{O}} \left( \left( \frac{1}{T} \right)^{\frac{1}{d_s + 1}} \right)$. In addition, based on the principles of Random Search, we introduce an algorithm, called BLiN-MOS, for Lipschitz bandits in doubling metric spaces that are also endowed with a probability measure, and show that under mild conditions, BLiN-MOS achieves a regret rate of order $\widetilde{\mathcal{O}} \left( T^{\frac{d_z}{d_z + 1}} \right)$, where $d_z$ is the zooming dimension of the problem instance.
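To make the noise-free setting concrete, the following is a minimal sketch of plain Random Search: draw $T$ i.i.d. points and return the best one observed. It is an illustration under stated assumptions, not the paper's algorithm listing: the uniform sampling distribution on $[0,1]^d$, the helper names `random_search` and `toy_objective`, and the choice of test function are all hypothetical.

```python
import numpy as np

def random_search(f, d, T, rng):
    """Draw T i.i.d. uniform points in [0,1]^d and return the best one seen.

    Assumption: the sampling measure is uniform on the unit cube.
    """
    xs = rng.random((T, d))                  # T candidate points, one per row
    values = np.apply_along_axis(f, 1, xs)   # noise-free function evaluations
    best = int(np.argmax(values))
    return xs[best], float(values[best])

def toy_objective(x):
    # Hypothetical test function: maximum value 1, attained at x = 0.
    return 1.0 - float(np.max(np.abs(x)))

rng = np.random.default_rng(0)
gaps = {}
for T in (10, 100, 1000, 10000):
    _, best_value = random_search(toy_objective, d=2, T=T, rng=rng)
    gaps[T] = 1.0 - best_value               # optimality gap after T samples
    print(T, gaps[T])
```

Each budget $T$ uses a fresh sample here, so the printed gaps need not decrease monotonically from one line to the next; the claimed rates concern how the gap shrinks in probability as $T$ grows.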

Paper Structure (18 sections, 12 theorems, 64 equations, 1 figure, 3 algorithms)

Introduction
Related Works
Understanding Random Search via the Scattering Dimension
Important Special Cases
Scattering Dimension of Norm Polynomials
The Random Search Algorithm
The Random Search Algorithm in Noisy Environments
Zooming Dimension versus Scattering Dimension
The Curse and Blessing of Zooming Dimension
A Numeric Example
Scattering Dimension Requires a Probability Measure
The BLiN-MOS Algorithm
Notations and Conventions
Analysis of BLiN-MOS
BLiN-MOS with Improved Communication Complexity
...and 3 more sections

Key Result

Proposition 1

Let $p \ge 1$, and let $g_p (x) : [0,1]^d \to \mathbb{R}$ be defined as $g_p (x) = 1 - \frac{1}{p} \| x \|_\infty^p$. The scattering dimension of $g_p$ is $d_s = \frac{d}{p}$ and the scattering constant of $g_p$ is $\kappa_s = 1$.
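One way to build intuition for the value $d/p$ is to look at how much of the cube is near-optimal. Assuming the uniform probability measure on $[0,1]^d$ (an assumption; this excerpt does not state the measure or the formal definition of $d_s$), the gap $1 - g_p(x) = \frac{1}{p}\|x\|_\infty^p$ is at most $\epsilon$ exactly when $\|x\|_\infty \le (p\epsilon)^{1/p}$, a set of measure $(p\epsilon)^{d/p}$ whenever $p\epsilon \le 1$. The sketch below checks that scaling by Monte Carlo; the helper names are hypothetical, and the check is only an illustration of the exponent $d/p$, not a restatement of the paper's proof.

```python
import numpy as np

def g_p(x, p):
    """g_p(x) = 1 - (1/p) * ||x||_inf^p, evaluated row-wise on an (n, d) array."""
    return 1.0 - np.max(np.abs(x), axis=-1) ** p / p

def near_optimal_fraction(d, p, eps, n, rng):
    """Fraction of n uniform samples in [0,1]^d whose gap 1 - g_p(x) is <= eps."""
    xs = rng.random((n, d))
    gaps = 1.0 - g_p(xs, p)
    return float(np.mean(gaps <= eps))

rng = np.random.default_rng(0)
d, eps, n = 2, 0.05, 200_000
results = {}
for p in (1, 2, 4):
    empirical = near_optimal_fraction(d, p, eps, n, rng)
    predicted = (p * eps) ** (d / p)   # measure of {x : ||x||_inf <= (p*eps)^(1/p)}
    results[p] = (empirical, predicted)
    print(p, empirical, predicted)
```

Larger $p$ flattens $g_p$ near its maximizer, so a larger share of uniform samples lands in the near-optimal region, which is consistent with a smaller exponent $d/p$.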

Figures (1)

Figure 1: The plots of $g_p$ with $p = 1,3,5,10$ over $[0,1]$.

Theorems & Definitions (30)

Definition 1
Remark 1
Definition 2
Proposition 1
Theorem 1
Theorem 2
Theorem 3
Proposition 2
proof
Definition 3: Canonical probability measure
...and 20 more
