Quality-Diversity with Limited Resources

Ren-Jian Wang, Ke Xue, Cong Guan, Chao Qian

TL;DR

Quality-Diversity (QD) algorithms aim to produce a diverse set of high-quality solutions, but training them is resource-intensive because they maintain large archives and populations. RefQD addresses this by decomposing each neural network into a representation part and a small decision part, and sharing a single representation part across all decision parts in the archive. To mitigate the mismatch between old decision parts and the newly updated representation part, it introduces a deep decision archive with multi-level cells, together with top-$k$ re-evaluation and learning-rate decay. Empirically, RefQD achieves comparable or better QD metrics than resource-intensive baselines while cutting GPU memory usage to 16% on QDax and 3.7% on Atari, and it also reduces RAM usage; the tasks include ones with image-based observations. This makes QD more practical for larger problems and constrained hardware, with potential synergy with archive distillation and model pruning.
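
The summary above describes the deep decision archive only in one sentence, so the following is a minimal sketch of how multi-level cells and top-level re-evaluation might fit together, not the authors' implementation. The class name `DeepDecisionArchive`, the method names, and the `evaluate` callback (assumed to return a fitness and a cell index for a decision part under the current shared representation) are all hypothetical.

```python
class DeepDecisionArchive:
    """Hypothetical archive: each cell keeps up to `depth` decision parts."""

    def __init__(self, depth):
        self.depth = depth
        self.cells = {}  # cell index -> list of (fitness, decision_part), best first

    def add(self, cell, fitness, decision_part):
        """Insert a decision part, keeping only the `depth` fittest per cell."""
        levels = self.cells.setdefault(cell, [])
        levels.append((fitness, decision_part))
        levels.sort(key=lambda entry: entry[0], reverse=True)
        del levels[self.depth:]

    def top(self, cell):
        """Return the (fitness, decision_part) pair at the top level of a cell."""
        return self.cells[cell][0]

    def reevaluate_top(self, evaluate, k):
        """Re-score the top k levels of every cell and re-insert them.

        `evaluate` is an assumed callback mapping a decision part to
        (fitness, cell) under the current shared representation.
        """
        pending = []
        for cell in list(self.cells):
            levels = self.cells[cell]
            pending.extend(part for _, part in levels[:k])
            del levels[:k]
            if not levels:
                del self.cells[cell]
        for part in pending:
            fitness, cell = evaluate(part)
            self.add(cell, fitness, part)


# Usage with made-up fitness values: after the representation changes,
# decision part "b" scores worse and the lower-level part "c" takes over.
archive = DeepDecisionArchive(depth=2)
archive.add(0, 1.0, "a")
archive.add(0, 3.0, "b")
archive.add(0, 2.0, "c")
new_scores = {"b": (0.5, 0), "c": (2.0, 0)}
archive.reevaluate_top(lambda part: new_scores[part], k=1)
```

In this sketch the lower levels act as backups: when a stored fitness goes stale because the shared representation has moved on, re-evaluation can demote the old top entry without leaving the cell empty.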

Abstract

Quality-Diversity (QD) algorithms have emerged as a powerful optimization paradigm that aims to generate a set of high-quality and diverse solutions. To achieve this challenging goal, QD algorithms must maintain a large archive and a large population in each iteration, which raises two main issues: sample efficiency and resource efficiency. Most advanced QD algorithms focus on improving sample efficiency, while resource efficiency has been overlooked to some extent. In particular, the resource overhead during training has not yet been addressed, which hinders the wider application of QD algorithms. In this paper, we highlight this important research question, i.e., how to efficiently train QD algorithms with limited resources, and propose a novel and effective method called RefQD to address it. RefQD decomposes a neural network into representation and decision parts, and shares the representation part with all decision parts in the archive to reduce the resource overhead. It also employs a series of strategies to address the mismatch between the old decision parts and the newly updated representation part. Experiments on different types of tasks, from small to large resource consumption, demonstrate the excellent performance of RefQD: it not only uses significantly fewer resources (e.g., 16% of the GPU memory on QDax and 3.7% on Atari) but also achieves comparable or better performance compared to sample-efficient QD algorithms. Our code is available at https://github.com/lamda-bbo/RefQD.
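
To see why sharing the representation part reduces overhead, it can help to count stored parameters. The sketch below is a back-of-the-envelope illustration under assumed conditions: a fully connected policy, hypothetical layer widths, and a hypothetical archive size. It counts parameters only, so it does not reproduce the GPU memory figures reported in the abstract, which also depend on populations, activations, and optimizer state.

```python
def mlp_param_count(layer_sizes):
    """Number of weights and biases in a fully connected network."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


def archive_param_counts(layer_sizes, split_at, num_cells):
    """Parameters stored with full networks versus a shared representation.

    layer_sizes: widths of the whole policy, from input to output.
    split_at: index into layer_sizes where the decision part begins.
    num_cells: number of solutions kept in the archive.
    """
    representation = mlp_param_count(layer_sizes[: split_at + 1])
    decision = mlp_param_count(layer_sizes[split_at:])
    # Without sharing, every archived solution stores the whole network.
    full_archive = num_cells * (representation + decision)
    # With sharing, one representation serves all decision parts.
    shared_archive = representation + num_cells * decision
    return full_archive, shared_archive


# Hypothetical sizes: a 64-256-256-8 policy split before the last layer,
# with 1024 archive cells.
full, shared = archive_param_counts([64, 256, 256, 8], split_at=2, num_cells=1024)
ratio = shared / full
```

For these made-up sizes the shared layout stores roughly 2.5% of the parameters of the unshared one. The saving grows as the representation part dominates the network and as the archive gets larger.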

Paper Structure

This paper contains 22 sections, 4 equations, 10 figures, 5 tables, and 2 algorithms.

Figures (10)

  • Figure 1: Performance and resource comparisons between RefQD and the baselines.
  • Figure 2: Performance comparison in terms of QD-Score, Coverage, and Max Fitness on eight environments of QDax. The medians and the first and third quartile intervals are depicted with curves and shaded areas, respectively.
  • Figure 3: Performance comparison in terms of QD-Score, Coverage, and Max Fitness using CNN as policy networks on two environments of Atari. The medians and the first and third quartile intervals are depicted with curves and shaded areas, respectively.
  • Figure 4: QD-Score of RefQD with different period $T_r$ of re-evaluation on four environments of QDax.
  • Figure 5: QD-Score of RefQD with different number $k$ of top-$k$ re-evaluation on four environments of QDax.
  • ...and 5 more figures