Distributed Evolution Strategies with Multi-Level Learning for Large-Scale Black-Box Optimization

Qiqi Duan; Chang Shao; Guochen Zhou; Minghan Zhang; Qi Zhao; Yuhui Shi

TL;DR

This work tackles large-scale black-box optimization by marrying LM-CMA with a multilevel learning-based meta-framework to exploit distributed computation. The outer CMA-ES governs meta-parameters while multiple inner LM-CMA solvers run in parallel, using isolation time to balance learning progress and communication; the framework introduces elitist and multi-recombination updates, spatiotemporal global step-size adaptation, and collective learning of CMA on structured populations. Empirical results on 2000-dimensional, memory-expensive benchmarks show competitive local and global search performance with quantifiable trade-offs between communication overhead and model richness, aided by a Ray-based distributed implementation. The approach offers a scalable path for distributed black-box optimization, with open-source code to support replication and further development.

Abstract

In the post-Moore era, the main performance gains of black-box optimizers increasingly depend on parallelism, especially for large-scale optimization (LSO). Here we propose to parallelize the well-established covariance matrix adaptation evolution strategy (CMA-ES), and in particular one of its latest LSO variants, limited-memory CMA-ES (LM-CMA). To achieve efficiency while approximating its powerful invariance property, we present a multilevel learning-based meta-framework for distributed LM-CMA. Owing to its hierarchically organized structure, Meta-ES is well-suited to implement our distributed meta-framework, wherein the outer-ES controls strategy parameters while all parallel inner-ESs run the serial LM-CMA with different settings. For the distribution mean update of the outer-ES, both the elitist and multi-recombination strategies are used in parallel to avoid stagnation and regression, respectively. To exploit spatiotemporal information, the global step-size adaptation combines Meta-ES with the parallel cumulative step-size adaptation. After each isolation time, our meta-framework employs both the structure and parameter learning strategies to combine aligned evolution paths for CMA reconstruction. Experiments on a set of large-scale benchmarking functions with memory-intensive evaluations, arguably reflecting many data-driven optimization problems, validate the benefits (e.g., effectiveness w.r.t. solution quality, and adaptability w.r.t. second-order learning) and costs of our meta-framework.
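The hierarchical idea in the abstract (an outer-ES that controls strategy parameters while several inner-ESs run in isolation with different settings) can be illustrated with a toy sketch. This is not the authors' implementation and rests on several assumptions: the inner solver is a plain (mu/mu, lambda)-ES with a fixed step size rather than LM-CMA, the inner runs execute serially rather than on a cluster, the step sizes are perturbed log-normally, and only the elitist mean update is shown. All function and parameter names (`inner_es`, `meta_es`, `isolation`, and so on) are hypothetical.

```python
import math
import random


def sphere(x):
    """Toy objective: sum of squares, with minimum 0 at the origin."""
    return sum(v * v for v in x)


def inner_es(f, mean, sigma, generations, lam, rng):
    """Stand-in inner solver: a (mu/mu, lambda)-ES with a fixed step size.

    Assumption: the paper runs LM-CMA here; this only mimics the interface.
    """
    mu = lam // 2
    mean = list(mean)
    for _ in range(generations):  # one isolation time, no communication
        pop = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
               for _ in range(lam)]
        pop.sort(key=f)
        # Multi-recombination: average the mu best offspring.
        mean = [sum(ind[i] for ind in pop[:mu]) / mu
                for i in range(len(mean))]
    return mean, f(mean)


def meta_es(f, mean, sigma, rounds=20, n_inner=4, isolation=10, lam=10, seed=0):
    """Outer loop: try several step sizes, keep the best result (elitist)."""
    rng = random.Random(seed)
    best_mean, best_f = list(mean), f(mean)
    history = [best_f]
    for _ in range(rounds):
        # Each inner ES gets its own log-normally perturbed step size.
        sigmas = [sigma * math.exp(0.5 * rng.gauss(0.0, 1.0))
                  for _ in range(n_inner)]
        # The inner runs are independent, so they could be dispatched in
        # parallel; they are executed one after another here for simplicity.
        results = [inner_es(f, best_mean, s, isolation, lam, rng)
                   for s in sigmas]
        k = min(range(n_inner), key=lambda i: results[i][1])
        sigma = sigmas[k]  # inherit the step size of the best inner ES
        if results[k][1] < best_f:  # elitist: never accept a worse mean
            best_mean, best_f = results[k]
        history.append(best_f)
    return best_mean, best_f, history


if __name__ == "__main__":
    start = [3.0] * 10
    _, final_f, hist = meta_es(sphere, start, sigma=1.0)
    print(f"initial cost {hist[0]:.3e}, final cost {final_f:.3e}")
```

Because the outer mean is only replaced when an inner run improves on it, the recorded best cost can never increase from one round to the next; the price is that a purely elitist update may stall, which is the motivation the abstract gives for using it alongside multi-recombination.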

Paper Structure (17 sections, 5 equations, 8 figures, 2 tables, 1 algorithm)

Introduction
Related Works
Parallel/Distributed Evolution Strategies
Large-Scale Variants of CMA-ES
A Multilevel Meta-Framework for DES
Hierarchical Organization of LM-CMA via Meta-ES
Distribution Mean Update at the Outer-ES Level
Spatiotemporal Global Step-Size Adaptation (STA)
Collective Learning of CMA on Structured Populations
A Meta-Framework for DES
Large-Scale Numerical Experiments
Experimental Settings
Comparing Local Search Abilities
Comparing Global Search Abilities
Overhead Analysis of Memory Communications
...and 2 more sections

Figures (8)

Figure 1: The flowchart diagram of our proposed approach, consisting of four components: hierarchical organization of LM-CMA via Meta-ES, distribution mean update at the outer-ES level, spatiotemporal global step-size adaptation, and collective learning of CMA reconstruction on structured populations.
Figure 2: Median convergence curves on a set of 2000-d unimodal functions given the maximal runtime (3 hours) and the cost threshold ($10^{-10}$).
Figure 3: Median convergence curves on a set of 2000-d unimodal functions given the maximal runtime (3 hours) and the cost threshold ($10^{-10}$).
Figure 4: Median convergence curves on a set of 2000-d multimodal functions given the maximal runtime (3 hours) and the cost threshold ($10^{-10}$).
Figure 5: Median convergence curves on a set of 2000-d multimodal functions given the maximal runtime (3 hours) and the cost threshold ($10^{-10}$).
...and 3 more figures
