Theoretical Analysis of Explicit Averaging and Novel Sign Averaging in Comparison-Based Search

Daiki Morinaga, Youhei Akimoto

TL;DR

The paper studies the impact of noise on comparison-based black-box optimization and shows that explicit averaging can degrade the estimation of ground-truth rankings when the noise is heavy-tailed, in particular when its mean does not exist. It establishes a theoretical framework under stable distributions to quantify the order estimation probability (OEP) and proves that explicit averaging is effective only for $\alpha\in(1,2]$, neutral at $\alpha=1$, and detrimental for $0<\alpha<1$. To address these limitations, the authors introduce sign averaging and prove that estimating the order of medians via sign comparisons remains reliable for all $\alpha\in(0,2]$ under symmetry and uniqueness assumptions. They also propose a practical weighting scheme that incorporates sign averaging into CMA-ES. Numerical experiments support the theory: sign averaging often outperforms explicit averaging, especially for heavy-tailed noise, and the proposed weighting lets CMA-ES use the estimated ranking information for robust optimization in noisy, comparison-based settings.
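
The three regimes of $\alpha$ can be read off from a standard scaling property of symmetric stable laws. As a sketch, assume i.i.d. symmetric noise $X_1,\dots,X_K \sim \mathbf{S}(\alpha, 0, \gamma, \delta)$; this simplified setting is an assumption of the illustration, not the paper's general statement. Comparing characteristic functions $\exp(-\gamma^\alpha|t|^\alpha + i\delta t)$ gives

$$\frac{1}{K}\sum_{k=1}^{K} X_k \sim \mathbf{S}\left(\alpha,\, 0,\, K^{1/\alpha - 1}\gamma,\, \delta\right).$$

The scale factor $K^{1/\alpha-1}$ shrinks as $K$ grows when $\alpha>1$, stays equal to $1$ when $\alpha=1$, and grows when $0<\alpha<1$, so averaging concentrates, leaves unchanged, or spreads out the noise in the three cases respectively.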

Abstract

In black-box optimization, noise in the objective function is inevitable. Noise disrupts the ranking of candidate solutions in comparison-based optimization, possibly deteriorating the search performance compared with a noiseless scenario. Explicit averaging takes the sample average of noisy objective function values and is widely used as a simple and versatile noise-handling technique. Although it is suitable for various applications, it is ineffective if the mean is not finite. We theoretically reveal that explicit averaging has a negative effect on the estimation of ground-truth rankings when assuming stably distributed noise without a finite mean. Alternatively, sign averaging is proposed as a simple but robust noise-handling technique. We theoretically prove that sign averaging estimates the order of the medians of the noisy objective function values of a pair of points with arbitrarily high probability as the number of samples increases. Its advantages over explicit averaging and its robustness are also confirmed through numerical experiments.
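
The contrast between the two techniques can be illustrated with a small Monte Carlo experiment. The sketch below is not the authors' code. It assumes additive standard Cauchy noise ($\alpha=1$), two points whose noiseless values differ by a hypothetical `gap`, and a paired form of sign averaging that averages $\mathrm{sign}(y_{1,k} - y_{2,k})$ over $k$; the paper's exact estimator may differ. The function names `oep_explicit`, `oep_sign`, and `simulate` are made up for this illustration.

```python
import numpy as np


def oep_explicit(y1, y2):
    """Fraction of trials in which the sample mean ranks point 1 below point 2."""
    return float(np.mean(y1.mean(axis=1) < y2.mean(axis=1)))


def oep_sign(y1, y2):
    """Fraction of trials in which the averaged pairwise sign is negative.

    Assumption: samples are compared in pairs (k-th sample of point 1
    against k-th sample of point 2).
    """
    return float(np.mean(np.sign(y1 - y2).mean(axis=1) < 0))


def simulate(K, trials=2000, gap=1.0, seed=0):
    """Estimate both order estimation probabilities under Cauchy noise.

    Point 1 has noiseless value 0 and point 2 has noiseless value `gap`,
    so the correct order (for minimization) ranks point 1 first.
    """
    rng = np.random.default_rng(seed)
    y1 = 0.0 + rng.standard_cauchy((trials, K))
    y2 = gap + rng.standard_cauchy((trials, K))
    return oep_explicit(y1, y2), oep_sign(y1, y2)


if __name__ == "__main__":
    # Odd K avoids ties in the averaged sign.
    for K in (1, 11, 101):
        explicit, sign = simulate(K)
        print(f"K={K:4d}  explicit averaging: {explicit:.3f}  sign averaging: {sign:.3f}")
```

Under these assumptions the estimate for explicit averaging should stay roughly constant in $K$, since the mean of Cauchy samples is again Cauchy with the same scale, while the estimate for sign averaging should approach 1 as $K$ increases.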

Paper Structure

This paper contains 29 sections, 10 theorems, 50 equations, and 2 figures.

Key Result

Proposition 1

The following properties hold for stable distributions:

Figures (2)

  • Figure 1: Results of 10 runs of CMA-ES with explicit averaging. The solid lines indicate the median of the moving average of Tau-b (left axis) with sample sizes $K=1$ ($\bullet$), $K=10$ ($\blacktriangledown$), and $K=50$ ($\blacksquare$). The span of the moving average is $10$. The dashed lines indicate $-\log(f(m_t; \Delta))$ (right axis) with sample sizes $K=1$ ($\star$), $K=10$ ($\times$), and $K=50$ ($\blacklozenge$). The top and bottom of the band correspond to the 75% and 25% values among trials. The medians are taken over $10$ trials. (ANE, additive-noise ellipsoid; LNE, linear-noise ellipsoid; MNE, multiplicative-noise ellipsoid)
  • Figure 2: Results of 10 runs of CMA-ES with sign averaging. The solid lines indicate the median of the moving average of Tau-b (left axis) with sample sizes $K=1$ ($\bullet$), $K=10$ ($\blacktriangledown$), and $K=50$ ($\blacksquare$). The dashed lines indicate $-\log(f(m_t; \Delta))$ (right axis) with sample sizes $K=1$ ($\star$), $K=10$ ($\times$), and $K=50$ ($\blacklozenge$).

Theorems & Definitions (21)

  • Definition 1: Stable distribution $\mathbf{S}(\alpha, \beta, \gamma, \delta)$ (Definition 1.8 of Nolan, 2020)
  • Proposition 1: Linear transformation of stable distribution (Proposition 1.17 of Nolan, 2020)
  • Definition 2: OEP for $(x_1, \Delta)$ and $(x_2, \Delta)$
  • Lemma 2: Distribution of the difference of two averages
  • Theorem 3: OEP over stable distribution
  • Corollary 4: OEP over $\mathbf{S}(\alpha, 0, \gamma, \delta)$
  • Proposition 5: Sufficient condition for the symmetry assumption
  • Proposition 6: Sufficient condition for the symmetry assumption
  • Proposition 7: Sufficient condition for the uniqueness assumption
  • Definition 3: OEP on median
  • ...and 11 more