Bencher: Simple and Reproducible Benchmarking for Black-Box Optimization

Leonard Papenmeier, Luigi Nardi

TL;DR

Bencher tackles reproducibility challenges in black-box optimization benchmarking by isolating each benchmark in its own Python environment and exposing a version-agnostic RPC interface. The server–client architecture, combined with containerization via Docker and Singularity, decouples benchmarking from optimization algorithms and enables easy integration of real-world benchmarks. It currently supports around 80 benchmarks across continuous, categorical, and binary domains, with unit hypercube normalization for continuous problems, facilitating fair, repeatable comparisons in local and HPC settings. This approach reduces dependency conflicts and lowers setup overhead, enhancing the practicality and portability of benchmark studies for researchers and practitioners alike.
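The unit hypercube normalization mentioned above can be illustrated with a small sketch. This is not Bencher's code: the function names `from_unit_cube` and `to_unit_cube` and the example bounds are assumptions made for illustration. It only shows the general idea that an optimizer can propose points in [0, 1]^d while each benchmark is evaluated in its native box bounds.

```python
from typing import List, Sequence


def from_unit_cube(
    x_unit: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> List[float]:
    """Map a point in [0, 1]^d to the box [lower, upper] (hypothetical helper)."""
    if not (len(x_unit) == len(lower) == len(upper)):
        raise ValueError("x_unit, lower, and upper must have the same length")
    if any(not 0.0 <= u <= 1.0 for u in x_unit):
        raise ValueError("all coordinates must lie in [0, 1]")
    # Affine rescaling, applied per dimension.
    return [lo + u * (hi - lo) for u, lo, hi in zip(x_unit, lower, upper)]


def to_unit_cube(
    x: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> List[float]:
    """Inverse map from the box [lower, upper] back to [0, 1]^d (hypothetical helper)."""
    if not (len(x) == len(lower) == len(upper)):
        raise ValueError("x, lower, and upper must have the same length")
    return [(xi - lo) / (hi - lo) for xi, lo, hi in zip(x, lower, upper)]


if __name__ == "__main__":
    # Assumed example bounds: a 3-dimensional problem defined on [-5, 5]^3.
    lower, upper = [-5.0, -5.0, -5.0], [5.0, 5.0, 5.0]
    print(from_unit_cube([0.0, 0.5, 1.0], lower, upper))
```

With such a mapping, an optimizer needs no knowledge of per-benchmark bounds, which is one way a shared input domain can make comparisons across continuous problems more uniform.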

Abstract

We present Bencher, a modular benchmarking framework for black-box optimization that fundamentally decouples benchmark execution from optimization logic. Unlike prior suites that focus on combining many benchmarks in a single project, Bencher introduces a clean abstraction boundary: each benchmark is isolated in its own virtual Python environment and accessed via a unified, version-agnostic remote procedure call (RPC) interface. This design eliminates dependency conflicts and simplifies the integration of diverse, real-world benchmarks, which often have complex and conflicting software requirements. Bencher can be deployed locally or remotely via Docker or on high-performance computing (HPC) clusters via Singularity, providing a containerized, reproducible runtime for any benchmark. Its lightweight client requires minimal setup and supports drop-in evaluation of 80 benchmarks across continuous, categorical, and binary domains.
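To make the decoupling idea concrete, the following is a minimal sketch of a benchmark server and a thin client that communicate only through a remote call. It uses Python's standard-library XML-RPC modules purely for illustration; this is not Bencher's actual protocol or API, and the names `evaluate`, `remote_evaluate`, `start_server`, and the `sphere` benchmark are assumptions.

```python
import threading
from typing import Callable, Dict, List
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def sphere(x: List[float]) -> float:
    """Toy benchmark standing in for a real one (assumed for illustration)."""
    return sum(xi * xi for xi in x)


# Server-side registry. In a real deployment each entry could live in its own
# isolated environment; here everything shares one process for brevity.
BENCHMARKS: Dict[str, Callable[[List[float]], float]] = {"sphere": sphere}


def evaluate(name: str, x: List[float]) -> float:
    """Single RPC entry point: look up the benchmark by name and evaluate x."""
    if name not in BENCHMARKS:
        raise ValueError(f"unknown benchmark: {name}")
    return float(BENCHMARKS[name](x))


def start_server() -> SimpleXMLRPCServer:
    """Start the server on a free localhost port in a background thread."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(evaluate, "evaluate")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_evaluate(port: int, name: str, x: List[float]) -> float:
    """Client side: only needs the port, the benchmark name, and the point."""
    with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
        return proxy.evaluate(name, x)


if __name__ == "__main__":
    server = start_server()
    port = server.server_address[1]
    print(remote_evaluate(port, "sphere", [0.5, 0.5]))
    server.shutdown()
    server.server_close()
```

The client in this sketch imports nothing from the benchmark itself, so the benchmark's dependencies never need to be installed alongside the optimizer. That is the property the abstract attributes to the RPC boundary.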

Paper Structure

This paper contains 11 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: The Bencher architecture. The server runs in a Docker container and listens for RPC requests from clients, which can run on the same machine or on a different one. The server comprises multiple Poetry environments, one per benchmark.
  • Figure 2: The Bencher directory structure.