Massively parallel CMA-ES with increasing population

David Redon; Pierre Fortin; Bilel Derbel; Miwako Tsuji; Mitsuhisa Sato


TL;DR

This paper shows how BLAS and LAPACK routines can be introduced in linear algebra operations, and proposes two strategies for deploying IPOP-CMA-ES efficiently on large-scale parallel architectures with up to thousands of CPU cores.

Abstract

The Increasing Population Covariance Matrix Adaptation Evolution Strategy (IPOP-CMA-ES) algorithm is a reference stochastic optimizer dedicated to blackbox optimization, where no prior knowledge about the underlying problem structure is available. This paper aims to accelerate IPOP-CMA-ES through high performance computing and parallelism when solving large optimization problems. We first show how BLAS and LAPACK routines can be introduced in linear algebra operations, and we then propose two strategies for deploying IPOP-CMA-ES efficiently on large-scale parallel architectures with thousands of CPU cores. The first parallel strategy processes the multiple searches in the same order as the sequential IPOP-CMA-ES, while the second one processes these multiple searches concurrently. These strategies are implemented in MPI+OpenMP and compared on 6144 cores of the supercomputer Fugaku. We obtain substantial speedups (up to several thousand), including super-linear ones, and we provide an in-depth analysis of our results to explain the superior performance of our second strategy.
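The "multiple searches" mentioned in the abstract are successive CMA-ES descents run with a growing population. The following is a minimal sketch of such a schedule, not the authors' implementation. It assumes the commonly used CMA-ES default population size, 4 + floor(3 ln n), and a doubling of the population multiplier K between descents (consistent with the Figure 5 caption, which mentions K from 1 to 2^8). The function names `default_population_size` and `ipop_schedule` are hypothetical.

```python
import math


def default_population_size(dimension: int) -> int:
    """Assumed default CMA-ES population size: 4 + floor(3 * ln(n))."""
    return 4 + math.floor(3 * math.log(dimension))


def ipop_schedule(dimension: int, max_exponent: int = 8) -> list[tuple[int, int]]:
    """Return (K, population size) pairs for successive descents.

    Assumption: K doubles at each restart, from 2^0 up to 2^max_exponent,
    and each descent uses K times the default population size.
    """
    base = default_population_size(dimension)
    return [(2 ** i, (2 ** i) * base) for i in range(max_exponent + 1)]


if __name__ == "__main__":
    # Print the assumed schedule for a 10-dimensional problem.
    for k, population in ipop_schedule(10):
        print(f"K = {k:3d}  population size = {population}")
```

Under these assumptions, the later descents evaluate far more points per iteration than the earlier ones, which is why the way the descents are mapped onto cores (one after the other, or concurrently) matters for core occupancy.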


Paper Structure (14 sections, 2 equations, 9 figures, 5 tables, 3 algorithms)


Introduction
CMA-ES with increasing population
The Covariance Matrix Adaptation Evolution Strategy
The increasing population restart strategy
High performance parallel strategies
High performance linear algebra
The parallel strategies
Parallelism within a CMA-ES descent
The K-Replicated strategy
The K-Distributed strategy
Experimental setup
Performance assessment methodology
Conclusion
Acknowledgments

Figures (9)

Figure 1: Convergence example of CMA-ES on a function space. The white dot indicates the function optimum, the red ellipse the normal law, and the red crosses points sampled according to this law.
Figure 2: Illustration of the core occupancy of a naive version of IPOP-CMA-ES with successive parallel descents.
Figure 3: Illustration of the core occupancy of the K-Replicated strategy.
Figure 4: Illustration of the K-Distributed algorithm.
Figure 5: (upper-left) Performance gains for the eigendecomposition of the $C$ matrix when using LAPACK over the reference C code (written without LAPACK). (upper-right, resp. lower-left) Performance gains for the adaptation of the $C$ matrix (resp. for the sampling) when using Level 2 or Level 3 BLAS over the reference C code (without BLAS). (lower-right) Performance gains over the reference C code (without BLAS and LAPACK) for all the linear algebra part, with LAPACK for the eigendecomposition and Level 3 BLAS for the $C$ matrix adaptation, when using Level 2 or Level 3 BLAS routines for the sampling. The IPOP columns correspond to an IPOP-CMA-ES execution with successive descents using $K$ from 1 to $2^8$.
...and 4 more figures
