Multi-Agent Quantum Reinforcement Learning using Evolutionary Optimization

Michael Kölle, Felix Topp, Thomy Phan, Philipp Altmann, Jonas Nüßlein, Claudia Linnhoff-Popien

TL;DR

The paper addresses the challenge of scalable multi-agent reinforcement learning (MARL) with reduced parameter requirements by proposing a gradient-free, evolutionary optimization approach using Variational Quantum Circuits (VQCs) for Multi-Agent Quantum Reinforcement Learning (MAQRL). It evaluates three genetic variations in a multi-agent setting on the Coin Game, comparing them against classical neural networks with both similar and substantially larger parameter counts. The results show that VQCs can outperform small neural networks and match the performance of large networks while using 97.88% fewer parameters, highlighting strong parameter efficiency. This work demonstrates the potential of gradient-free quantum learning for MARL and points to future work in hardware implementations and broader baselines.

Abstract

Multi-Agent Reinforcement Learning is becoming increasingly important in times of autonomous driving and other smart industrial applications. Simultaneously, a promising new approach to Reinforcement Learning is emerging that uses the inherent properties of quantum mechanics to significantly reduce the number of trainable parameters of a model. However, gradient-based Multi-Agent Quantum Reinforcement Learning methods often struggle with barren plateaus, which hold them back from matching the performance of classical approaches. While gradient-free Quantum Reinforcement Learning methods may alleviate some of these challenges, they too are not immune to the difficulties posed by barren plateaus. We build upon an existing approach for gradient-free Quantum Reinforcement Learning and propose three genetic variations with Variational Quantum Circuits for Multi-Agent Reinforcement Learning using evolutionary optimization. We evaluate our genetic variations in the Coin Game environment and compare them to classical approaches. We show that our Variational Quantum Circuit approaches perform significantly better than a neural network with a similar number of trainable parameters. Compared to the larger neural network, our approaches achieve similar results using $97.88\%$ fewer parameters.
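
The abstract describes training with gradient-free evolutionary optimization rather than gradient descent. A minimal sketch of that general idea, assuming a mutation-only scheme with truncation selection, might look as follows. It is not the authors' algorithm: the names `evolve` and `toy_fitness`, all hyperparameter values, and the toy fitness function (a stand-in for an agent's Coin Game score, with a plain parameter vector in place of VQC rotation angles) are hypothetical and chosen only for illustration.

```python
import math
import random


def toy_fitness(params):
    """Hypothetical stand-in for an agent's score: highest when every parameter is 1.0."""
    return -sum((p - 1.0) ** 2 for p in params)


def evolve(fitness, n_params, pop_size=20, n_elite=5, sigma=0.1,
           generations=50, seed=0):
    """Mutation-only evolutionary search over real-valued parameter vectors.

    No gradients are computed: individuals are ranked by fitness, the best
    n_elite survive unchanged, and the remainder of the next generation is
    filled with Gaussian-mutated copies of randomly chosen elites.
    """
    rng = random.Random(seed)
    # Assumption: parameters start uniformly in [-pi, pi], as one might
    # initialize rotation angles of a parameterized circuit.
    population = [[rng.uniform(-math.pi, math.pi) for _ in range(n_params)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[:n_elite]
        best_history.append(fitness(elite[0]))
        # Children are mutated copies of elites; the elites themselves are kept.
        children = [[theta + rng.gauss(0.0, sigma) for theta in rng.choice(elite)]
                    for _ in range(pop_size - n_elite)]
        population = elite + children
    best = max(population, key=fitness)
    return best, best_history


if __name__ == "__main__":
    best, history = evolve(toy_fitness, n_params=4)
    print("first-generation best fitness:", history[0])
    print("final best fitness:", toy_fitness(best))
```

Because the elites are carried over unchanged, the best fitness in this sketch never decreases from one generation to the next. In a multi-agent setting, each individual's fitness would instead come from playing episodes of the environment, which this toy example does not model.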

Paper Structure (22 sections, 8 equations, 9 figures, 1 algorithm)

Figures (9)

  • Figure 1: Structure of a Variational Quantum Circuit
  • Figure 2: Variational Quantum Circuit
  • Figure 3: Example State of the Coin Game by phanAAMAS22.
  • Figure 4: Average Score over the entire population. Each individual has completed 50 steps in the Coin Game environment each generation.
  • Figure 5: Comparison of (a) average coins collected, (b) average own coins collected, and (c) own coin rate in a 50-step Coin Game each generation, averaged over 10 seeds.
  • ...and 4 more figures