From Random Search to Bandit Learning in Metric Measure Spaces

Chuying Han, Yasong Feng, Tianyu Wang

TL;DR

This paper introduces the scattering dimension $d_s$ to provide a non-heuristic theory for Random Search in hyperparameter optimization, showing that the optimality gap decays as $\widetilde{O}((1/T)^{1/d_s})$ in noise-free settings and as $\widetilde{O}((1/T)^{1/(d_s+1)})$ under bounded iid noise. It connects $d_s$ to the zooming dimension $d_z$ and demonstrates that, in metric-measure spaces endowed with a probability measure, the BLiN-MOS algorithm achieves regret $\widetilde{O}(T^{d_z/(d_z+1)})$ with only $O(\log\log T)$ communication rounds. The work also clarifies the relationship between scattering and zooming dimensions, showing how landscape geometry governs both sampling efficiency and near-optimal-region structure, and emphasizes the necessity of a well-defined probability measure for scattering-dimension analysis. Overall, the results furnish the first non-heuristic justification for Random Search performance, propose a Lipschitz-bandit algorithm tailored to metric spaces, and quantify fundamental trade-offs between discrimination and exploration in high-dimensional landscapes.

Abstract

Random Search is one of the most widely used methods for Hyperparameter Optimization, and is critical to the success of deep learning models. Despite its astonishing performance, little non-heuristic theory has been developed to describe its underlying working mechanism. This paper gives a theoretical accounting of Random Search. We introduce the concept of scattering dimension, which describes the landscape of the underlying function and quantifies the performance of random search. We show that, when the environment is noise-free, the output of random search converges to the optimal value in probability at rate $ \widetilde{\mathcal{O}} \left( \left( \frac{1}{T} \right)^{ \frac{1}{d_s} } \right) $, where $ d_s \ge 0 $ is the scattering dimension of the underlying function. When the observed function values are corrupted by bounded i.i.d. noise, the output of random search converges to the optimal value in probability at rate $ \widetilde{\mathcal{O}} \left( \left( \frac{1}{T} \right)^{ \frac{1}{d_s + 1} } \right) $. In addition, based on the principles of random search, we introduce an algorithm, called BLiN-MOS, for Lipschitz bandits in doubling metric spaces that are also endowed with a probability measure, and show that under mild conditions, BLiN-MOS achieves a regret rate of order $ \widetilde{\mathcal{O}} \left( T^{ \frac{d_z}{d_z + 1} } \right) $, where $d_z$ is the zooming dimension of the problem instance.
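
For readers who want something concrete, the noise-free procedure analyzed in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the search space is $[0,1]^d$ with the uniform sampling measure, and the names `random_search` and `toy_objective` are hypothetical.

```python
import numpy as np

def random_search(f, d, T, rng):
    """Draw T points i.i.d. uniformly from [0,1]^d (an assumption of this
    sketch) and return the sample with the largest observed value."""
    xs = rng.random((T, d))                  # T uniform samples in [0,1]^d
    vals = np.array([f(x) for x in xs])      # noise-free evaluations
    best = int(np.argmax(vals))
    return xs[best], float(vals[best])

def toy_objective(x):
    # Hypothetical test function chosen for illustration only:
    # maximum value 1.0, attained at the origin.
    return 1.0 - float(np.max(np.abs(x)))

rng = np.random.default_rng(0)
for T in (10, 100, 1000):
    x_best, best_val = random_search(toy_objective, d=2, T=T, rng=rng)
    # Optimality gap = (known maximum) - (best value found).
    print(T, 1.0 - best_val)
```

The printed gaps depend on the random seed, so a single run only hints at the trend; averaging over many seeds would be needed before comparing against the stated rate.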

Paper Structure

This paper contains 18 sections, 12 theorems, 64 equations, 1 figure, 3 algorithms.

Key Result

Proposition 1

Let $p \ge 1$, and let $g_p (x) : [0,1]^d \to \mathbb{R}$ be defined as $g_p (x) = 1 - \frac{1}{p} \| x \|_\infty^p$. The scattering dimension of $g_p$ is $d_s = \frac{d}{p}$ and the scattering constant of $g_p$ is $\kappa_s = 1$.
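
A quick sanity check shows how Proposition 1 lines up with the noise-free rate stated in the abstract. It assumes Random Search draws its $T$ points i.i.d. from the uniform distribution on $[0,1]^d$ (the sampling measure is not stated in this excerpt), and the tolerance $\epsilon$ and failure probability $\delta$ are notation introduced here. This is a back-of-the-envelope calculation, not the paper's proof. Since $g_p$ attains its maximum value $1$ at $x = 0$, the optimality gap of a sample $x$ is $\frac{1}{p} \| x \|_\infty^p$, so for $0 < \epsilon \le 1/p$:

$$
\Pr\left[ 1 - g_p(x) \le \epsilon \right] = \Pr\left[ \| x \|_\infty \le (p\epsilon)^{1/p} \right] = (p\epsilon)^{d/p},
$$

$$
\Pr\left[ \text{all } T \text{ samples have gap} > \epsilon \right] = \left( 1 - (p\epsilon)^{d/p} \right)^T \le \exp\left( -T (p\epsilon)^{d/p} \right).
$$

Choosing $\epsilon = \frac{1}{p} \left( \frac{\log(1/\delta)}{T} \right)^{p/d}$ (valid when $\log(1/\delta) \le T$) makes the right-hand side equal to $\delta$. With probability at least $1 - \delta$, the best sample therefore has gap at most $\frac{1}{p} \left( \frac{\log(1/\delta)}{T} \right)^{p/d}$, which is of order $\widetilde{O}((1/T)^{1/d_s})$ with $d_s = \frac{d}{p}$.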

Figures (1)

  • Figure 1: The plots of $g_p$ with $p = 1,3,5,10$ over $[0,1]$.

Theorems & Definitions (30)

  • Definition 1
  • Remark 1
  • Definition 2
  • Proposition 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Proposition 2
  • proof
  • Definition 3: Canonical probability measure
  • ...and 20 more