Sequential Multi-Agent Dynamic Algorithm Configuration
Chen Lu, Ke Xue, Lei Yuan, Yao Wang, Yaoyuan Wang, Sheng Fu, Chao Qian
TL;DR
The paper tackles dynamic algorithm configuration (DAC) for complex algorithms with inter-dependent hyperparameters by modeling the task as a contextual sequential multi-agent Markov decision process (MMDP). It introduces SADN, a Sequential Advantage Decomposition Network, which decomposes the global advantage into sequential per-agent advantages while satisfying the Individual-Global-Max (IGM) principle, so that agents can act in a decentralized way at execution time. Empirical results on synthetic DAC benchmarks and on the configuration of the multi-objective optimization algorithm MOEA/D show that SADN outperforms state-of-the-art MARL baselines and generalizes across problem classes, supporting the dependency-aware sequential approach. The work offers a paradigm for automated algorithm configuration that explicitly accounts for parameter inter-dependencies and action ordering, and the code is open source.
Abstract
Dynamic algorithm configuration (DAC) is a recent trend in automated machine learning: it adjusts an algorithm's configuration dynamically during execution and relieves users of tedious trial-and-error tuning. Recently, multi-agent reinforcement learning (MARL) approaches have improved the configuration of multiple heterogeneous hyperparameters, making it possible to configure many parameters of complex algorithms. However, many complex algorithms have inherent inter-dependencies among their parameters (e.g., the operator type must be determined before the operator's parameter). Previous approaches do not consider these inter-dependencies, which leads to sub-optimal results. In this paper, we propose the sequential multi-agent DAC (Seq-MADAC) framework to address this issue by taking the inherent inter-dependencies of multiple parameters into account. Specifically, we propose a sequential advantage decomposition network, which leverages action-order information through sequential advantage decomposition. Experiments ranging from synthetic functions to the configuration of multi-objective optimization algorithms demonstrate Seq-MADAC's superior performance over state-of-the-art MARL methods and show strong generalization across problem classes. Seq-MADAC establishes a new paradigm for dependency-aware automated algorithm configuration. Our code is available at https://github.com/lamda-bbo/seq-madac.
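To make the idea of sequential advantage decomposition more concrete, the following is a minimal toy sketch in Python. It is not the authors' network or code. It assumes, purely for illustration, that the joint advantage is a sum of per-agent terms, each conditioned on the actions of the agents that acted earlier, and that each term is normalized so its maximum over the agent's own action is zero. All function names and the tabular representation are hypothetical. Under these assumptions, choosing actions greedily one agent at a time recovers the joint maximizer, which is the property the Individual-Global-Max principle asks for.

```python
import itertools
import random


def make_advantage_tables(n_actions, seed=0):
    """Build toy per-agent advantage tables (hypothetical, not from the paper).

    tables[i][prefix] is a list of advantages for agent i's actions, given the
    tuple `prefix` of actions already chosen by agents 0..i-1. Each list is
    shifted so that its maximum is exactly 0.
    """
    rng = random.Random(seed)
    tables = []
    for i, k in enumerate(n_actions):
        table = {}
        # All possible action prefixes of the agents that act before agent i.
        for prefix in itertools.product(*[range(m) for m in n_actions[:i]]):
            raw = [rng.random() for _ in range(k)]
            best = max(raw)
            table[prefix] = [r - best for r in raw]
        tables.append(table)
    return tables


def joint_advantage(tables, joint_action):
    """Sum the per-agent advantages along the given action order."""
    return sum(
        tables[i][tuple(joint_action[:i])][a]
        for i, a in enumerate(joint_action)
    )


def sequential_greedy(tables):
    """Each agent picks its best action given the earlier agents' choices."""
    chosen = []
    for table in tables:
        adv = table[tuple(chosen)]
        chosen.append(max(range(len(adv)), key=adv.__getitem__))
    return tuple(chosen)


def brute_force_argmax(tables, n_actions):
    """Enumerate every joint action and return the one with highest advantage."""
    return max(
        itertools.product(*[range(k) for k in n_actions]),
        key=lambda a: joint_advantage(tables, a),
    )


if __name__ == "__main__":
    sizes = [2, 3, 2]  # e.g. operator type first, then two dependent parameters
    toy = make_advantage_tables(sizes, seed=1)
    print(sequential_greedy(toy), brute_force_argmax(toy, sizes))
```

In this toy setting the greedy pass needs only one lookup per agent, whereas the brute-force search grows with the product of the action-space sizes; the ordering matters because each later table is indexed by the earlier agents' choices.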
