Variational Quantum Rainbow Deep Q-Network for Optimizing Resource Allocation Problem

Truong Thanh Hung Nguyen, Truong Thinh Nguyen, Hung Cao

TL;DR

This work frames the human resource allocation problem (HRAP) as an MDP and introduces VQR-DQN, a hybrid quantum-classical DRL agent that embeds ring-topology variational quantum circuits (VQCs) as feature extractors within Rainbow DQN. By combining quantum representations with distributional Q-learning, prioritized replay, noisy networks, and multi-step targets, the approach reduces normalized makespan by 26.8% relative to random baselines and outperforms Double DQN and classical Rainbow DQN by 4.9-13.4% across four HRAP benchmarks. The study also analyzes how VQC topology influences expressibility and entanglement, finding that ring topologies provide broader entanglement and improved learning dynamics. These results suggest the potential of quantum-enhanced DRL for large-scale, combinatorial resource allocation as quantum hardware becomes more accessible.

Abstract

Resource allocation remains NP-hard due to combinatorial complexity. While deep reinforcement learning (DRL) methods, such as the Rainbow Deep Q-Network (DQN), improve scalability through prioritized replay and distributional heads, classical function approximators limit their representational power. We introduce Variational Quantum Rainbow DQN (VQR-DQN), which integrates ring-topology variational quantum circuits with Rainbow DQN to leverage quantum superposition and entanglement. We frame the human resource allocation problem (HRAP) as a Markov decision process (MDP) with combinatorial action spaces based on officer capabilities, event schedules, and transition times. On four HRAP benchmarks, VQR-DQN achieves 26.8% normalized makespan reduction versus random baselines and outperforms Double DQN and classical Rainbow DQN by 4.9-13.4%. These gains align with theoretical connections between circuit expressibility, entanglement, and policy quality, demonstrating the potential of quantum-enhanced DRL for large-scale resource allocation. Our implementation is available at: https://github.com/Analytics-Everywhere-Lab/qtrl/.

Paper Structure

This paper contains 38 sections, 17 equations, 4 figures, 2 tables, 2 algorithms.

Figures (4)

  • Figure 1: Architecture of VQR-DQN applied to the HRAP environment. Ring-topology VQCs are integrated into the Rainbow DQN pipeline, combining noisy exploration, prioritized replay, $n$-step returns, Double DQN, and dueling distributional $Q$-learning.
  • Figure 2: Ring-topology ansatz of the designed VQC, shown with 4 input qubits and 2 layers.
  • Figure 3: Learning curves of VQR-DQN and the other algorithms over 50,000 episodes for different HRAP configurations (with their action spaces).
  • Figure 4: Graphs of the topologies observed in different quantum computer architectures.
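
The ring topology named in the Figure 2 caption can be illustrated with a minimal sketch. It only lists which qubits are entangled in each layer, using the 4-qubit, 2-layer size from that caption. The gate choices (one `RY` rotation per qubit followed by `CNOT` gates), the function names, and the parameter labels are assumptions made for illustration and are not taken from the paper or its repository.

```python
def ring_entangling_pairs(n_qubits):
    """Return (control, target) pairs joining each qubit to its next neighbour.

    The last qubit wraps around to qubit 0, which closes the ring.
    """
    if n_qubits < 2:
        return []
    if n_qubits == 2:
        # With two qubits the wrap-around pair would only repeat the same link.
        return [(0, 1)]
    return [(q, (q + 1) % n_qubits) for q in range(n_qubits)]


def build_ring_ansatz(n_qubits, n_layers):
    """Describe a layered ring ansatz as a list of gate tuples.

    Assumption: each layer applies one trainable RY rotation per qubit and
    then a ring of CNOTs. The paper's actual gate set may differ.
    """
    ops = []
    param_index = 0
    for _ in range(n_layers):
        for q in range(n_qubits):
            ops.append(("RY", q, f"theta_{param_index}"))  # hypothetical label
            param_index += 1
        for control, target in ring_entangling_pairs(n_qubits):
            ops.append(("CNOT", control, target))
    return ops


# Size taken from the Figure 2 caption: 4 input qubits, 2 layers.
ansatz = build_ring_ansatz(n_qubits=4, n_layers=2)
for op in ansatz:
    print(op)
```

In this sketch every qubit appears once as a control and once as a target in each layer, so a ring has one more link than a linear chain over the same qubits.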