Quality-Diversity with Limited Resources
Ren-Jian Wang, Ke Xue, Cong Guan, Chao Qian
TL;DR
Quality-Diversity (QD) seeks to produce a diverse set of high-quality solutions, but training is resource-intensive because it maintains a large archive and a large population. RefQD addresses this by decomposing each neural network into a representation part and a small decision part, and sharing a single representation part across all solutions. To mitigate the mismatch between older decision parts and the updated representation, it introduces a deep decision archive with multi-level cells, top-level re-evaluation, and learning-rate decay. Empirically, RefQD achieves comparable or better QD metrics than resource-intensive baselines on QDax and Atari tasks, including tasks with image-based observations, while reducing GPU memory usage to 16% on QDax and 3.7% on Atari and also lowering RAM usage. This work thus enables scalable, resource-efficient QD for larger problems and constrained hardware, with potential synergy with archive distillation and model pruning.
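The deep decision archive mentioned above can be pictured with a small sketch. This is not the authors' implementation: the class name `DeepDecisionArchive`, the `depth` parameter, and the `evaluate` callback (assumed to return a fitness and a cell index under the current shared representation) are all hypothetical, and the re-insertion rule is only one plausible reading of "top-level re-evaluation".

```python
class DeepDecisionArchive:
    """Illustrative archive: each cell keeps up to `depth` decision parts."""

    def __init__(self, num_cells, depth):
        self.depth = depth
        # cells[i] holds (fitness, decision_part) pairs, best first.
        self.cells = [[] for _ in range(num_cells)]

    def add(self, cell_index, fitness, decision_part):
        """Insert a decision part and keep only the `depth` best in the cell."""
        cell = self.cells[cell_index]
        cell.append((fitness, decision_part))
        cell.sort(key=lambda entry: entry[0], reverse=True)
        del cell[self.depth:]

    def top_level(self):
        """Return the best entry of every non-empty cell."""
        return {i: cell[0] for i, cell in enumerate(self.cells) if cell}

    def reevaluate_top_level(self, evaluate):
        """Re-score top entries after the shared representation has changed.

        `evaluate` is an assumed callback: decision_part -> (fitness, cell_index).
        """
        # Remove all top entries first so each one is re-scored exactly once.
        tops = [cell.pop(0)[1] for cell in self.cells if cell]
        for decision_part in tops:
            fitness, cell_index = evaluate(decision_part)
            self.add(cell_index, fitness, decision_part)
```

In this sketch, a top-level entry whose score drops after re-evaluation can be overtaken by a lower-level entry in the same cell, which is the intuition behind keeping more than one level per cell.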
Abstract
Quality-Diversity (QD) algorithms have emerged as a powerful optimization paradigm that aims to generate a set of high-quality and diverse solutions. To achieve this challenging goal, QD algorithms must maintain a large archive and a large population in each iteration, which raises two main issues: sample efficiency and resource efficiency. Most advanced QD algorithms focus on improving sample efficiency, while resource efficiency has been largely overlooked. In particular, the resource overhead during training has not yet been addressed, which hinders the wider application of QD algorithms. In this paper, we highlight this important research question, i.e., how to efficiently train QD algorithms with limited resources, and propose a novel and effective method called RefQD to address it. RefQD decomposes a neural network into a representation part and a decision part, and shares the representation part among all decision parts in the archive to reduce the resource overhead. It also employs a series of strategies to address the mismatch between old decision parts and the newly updated representation part. Experiments on tasks ranging from small to large resource consumption demonstrate the excellent performance of RefQD: it not only uses significantly fewer resources (e.g., 16% of the GPU memory on QDax and 3.7% on Atari) but also achieves comparable or better performance compared to sample-efficient QD algorithms. Our code is available at https://github.com/lamda-bbo/RefQD.
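To see why sharing the representation part reduces storage, the following back-of-the-envelope sketch counts stored parameters for an archive of full networks versus one shared representation plus per-cell decision parts. The layer sizes, the cell count, and the choice of a single linear layer as the decision part are assumptions made for illustration. The result counts archive parameters only and is not expected to reproduce the measured GPU memory figures above.

```python
def mlp_param_count(layer_sizes):
    """Number of weights and biases in a fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def archive_param_counts(obs_dim, hidden_sizes, action_dim, num_cells):
    """Parameters stored by a full-network archive vs. a shared-representation one."""
    # Representation part: observation -> last hidden layer (assumed split point).
    representation = mlp_param_count([obs_dim, *hidden_sizes])
    # Decision part: last hidden layer -> action (assumed to be one linear layer).
    decision = mlp_param_count([hidden_sizes[-1], action_dim])
    full = num_cells * (representation + decision)
    shared = representation + num_cells * decision
    return full, shared


# Hypothetical sizes, chosen only to make the comparison concrete.
full, shared = archive_param_counts(
    obs_dim=28, hidden_sizes=[256, 256], action_dim=8, num_cells=1024
)
ratio = shared / full
```

With these assumed sizes, the representation part dominates each network, so storing it once rather than once per cell removes most of the archive's parameters.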
