Architectural Influence on Variational Quantum Circuits in Multi-Agent Reinforcement Learning: Evolutionary Strategies for Optimization

Michael Kölle, Karola Schneider, Sabrina Egger, Felix Topp, Thomy Phan, Philipp Altmann, Jonas Nüßlein, Claudia Linnhoff-Popien

TL;DR

This paper addresses the parameter-count and gradient challenges of multi-agent reinforcement learning (MARL) by using variational quantum circuits (VQCs) in a cooperative MARL setting. It extends prior work by introducing three architectural evolution strategies (Gate-Based, Layer-Based, and Prototype-Based) and evaluating them with an evolutionary algorithm in the Coin Game environment. Mutation-only evolution generally outperforms recombination, and Gate-Based VQCs perform best: they achieve the strongest scores, collect more coins, and reach a higher own-coin rate while using far fewer parameterized gates. Quantum agents consistently outperform random baselines and can match or exceed larger classical networks while using up to about 97.9% fewer parameters. These findings highlight the practical potential of architectural evolution for VQCs in multi-agent quantum reinforcement learning (MAQRL) and point to hardware experiments and broader quantum-classical hybrids as promising directions.

Abstract

In recent years, Multi-Agent Reinforcement Learning (MARL) has found application in numerous areas of science and industry, such as autonomous driving, telecommunications, and global health. Nevertheless, MARL suffers from limitations such as an exponential growth of dimensions. Inherent properties of quantum mechanics help to overcome these limitations, e.g., by significantly reducing the number of trainable parameters. Previous studies have developed an approach that uses gradient-free quantum Reinforcement Learning and evolutionary optimization for variational quantum circuits (VQCs) to reduce the number of trainable parameters and to avoid barren plateaus as well as vanishing gradients. With this approach, VQCs perform significantly better than classical neural networks with a similar number of trainable parameters and need more than 97% fewer parameters than similarly good neural networks. We extend the approach of Kölle et al. by proposing a Gate-Based, a Layer-Based, and a Prototype-Based concept to mutate and recombine VQCs. Our results show the best performance for mutation-only strategies and the Gate-Based approach. In particular, we observe a significantly better score, higher total and own collected coins, as well as a superior own coin rate for the best agent when evaluated in the Coin Game environment.
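
To make the idea of mutation-only, gate-level architecture evolution more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a circuit can be encoded as a flat list of gates, that a mutation inserts a gate, deletes a gate, or perturbs one rotation angle, and that selection keeps a fixed number of elite circuits. The gate pool, the mutation probabilities, and all names (`Gate`, `random_gate`, `mutate`, `evolve`, `count_parameters`) are hypothetical. The fitness function is a placeholder supplied by the caller; in the paper's setting it would be the score an agent obtains in the Coin Game.

```python
import math
import random
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Tuple

# Hypothetical gate pool; the gate set used in the paper is not stated in this section.
ROTATIONS = ("RX", "RY", "RZ")


@dataclass(frozen=True)
class Gate:
    name: str
    qubits: Tuple[int, ...]
    theta: Optional[float] = None  # None marks an unparameterized gate


Circuit = List[Gate]


def random_gate(n_qubits: int, rng: random.Random) -> Gate:
    """Draw a CNOT (assumed probability 0.3) or a parameterized rotation."""
    if n_qubits > 1 and rng.random() < 0.3:
        control, target = rng.sample(range(n_qubits), 2)
        return Gate("CNOT", (control, target))
    angle = rng.uniform(-math.pi, math.pi)
    return Gate(rng.choice(ROTATIONS), (rng.randrange(n_qubits),), angle)


def mutate(circuit: Circuit, n_qubits: int, rng: random.Random,
           sigma: float = 0.1) -> Circuit:
    """Return a mutated copy: insert a gate, delete a gate, or perturb one angle."""
    child = list(circuit)
    op = rng.choice(("insert", "delete", "perturb"))
    if op == "insert" or not child:
        child.insert(rng.randint(0, len(child)), random_gate(n_qubits, rng))
    elif op == "delete":
        child.pop(rng.randrange(len(child)))
    else:
        i = rng.randrange(len(child))
        if child[i].theta is not None:  # CNOTs have no angle to perturb
            child[i] = replace(child[i], theta=child[i].theta + rng.gauss(0.0, sigma))
    return child


def count_parameters(circuit: Circuit) -> int:
    """Number of trainable angles, i.e. parameterized gates."""
    return sum(gate.theta is not None for gate in circuit)


def evolve(fitness: Callable[[Circuit], float], n_qubits: int, pop_size: int = 20,
           n_elite: int = 5, generations: int = 10, seed: int = 0) -> Circuit:
    """Mutation-only loop: keep the elite unchanged, refill with mutated elites."""
    rng = random.Random(seed)
    population = [[random_gate(n_qubits, rng) for _ in range(4)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(population, key=fitness, reverse=True)[:n_elite]
        offspring = [mutate(rng.choice(elite), n_qubits, rng)
                     for _ in range(pop_size - n_elite)]
        population = elite + offspring
    return max(population, key=fitness)
```

A call such as `evolve(lambda c: -abs(count_parameters(c) - 3), n_qubits=3)` runs the loop with a toy fitness that merely rewards circuits with three parameterized gates. A real experiment would replace this lambda with an evaluation of the circuit as an agent policy in the environment, which this sketch does not attempt.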

Paper Structure

This paper contains 28 sections, 4 equations, 16 figures, and 2 algorithms.

Figures (16)

  • Figure 1: Variational quantum circuit used by Kölle et al. [kolle2023multi]
  • Figure 2: Training loop of Kölle et al. [kolle2023multi]
  • Figure 3: Training loop of the approach containing architectural changes
  • Figure 4: Example state of the Coin Game [phan2022emergent]
  • Figure 5: Average score over the entire population. Each individual completes 50 steps in the Coin Game environment in each generation [kolle2023multi]
  • ...and 11 more figures