Vehicle Routing Problems via Quantum Graph Attention Network Deep Reinforcement Learning

Le Tung Giang, Vu Hoang Viet, Nguyen Xuan Tung, Trinh Van Chien, Won-Joo Hwang

TL;DR

The paper tackles the Vehicle Routing Problem (VRP), a classical NP-hard routing optimization task, by integrating a Quantum Graph Attention Network (Q-GAT) into a deep reinforcement learning framework. The core idea is to replace the traditional MLP readouts in graph attention layers with parameterized quantum circuits (PQCs), giving a more compact yet still expressive encoder. Trained with Proximal Policy Optimization (PPO) and paired with a Pointer Network-based decoder, the approach converges faster and reduces routing cost by about 5% relative to classical GAT baselines, while cutting trainable parameters by more than 50%. These results point to the potential of PQC-enhanced GNNs for large-scale logistics optimization and suggest applicability to other routing and scheduling problems.

Abstract

The vehicle routing problem (VRP) is a fundamental NP-hard task in intelligent transportation systems with broad applications in logistics and distribution. Deep reinforcement learning (DRL) with Graph Neural Networks (GNNs) has shown promise, yet classical models rely on large multi-layer perceptrons (MLPs) that are parameter-heavy and memory-bound. We propose a Quantum Graph Attention Network (Q-GAT) within a DRL framework, where parameterized quantum circuits (PQCs) replace conventional MLPs at critical readout stages. The hybrid model maintains the expressive capacity of graph attention encoders while reducing trainable parameters by more than 50%. Using proximal policy optimization (PPO) with greedy and stochastic decoding, experiments on VRP benchmarks show that Q-GAT achieves faster convergence and reduces routing cost by about 5% compared with classical GAT baselines. These results demonstrate the potential of PQC-enhanced GNNs as compact and effective solvers for large-scale routing and logistics optimization.
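To make the idea of a PQC readout concrete, the following is a minimal sketch of a small state-vector simulation in plain Python. It is not the authors' circuit: the angle encoding with RY gates, the chain of CNOT entanglers, the Pauli-Z readout, and the names `pqc_readout`, `apply_ry`, `apply_cnot`, and `expect_z` are all assumptions chosen for illustration. It only shows how a feature vector could be mapped to an output vector using one trainable angle per qubit per layer.

```python
import math


def apply_ry(state, qubit, theta):
    """Apply RY(theta) to one qubit of a real-valued state vector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    bit = 1 << qubit
    new = state[:]
    for i in range(len(state)):
        if not i & bit:  # visit each (|0>, |1>) amplitude pair once
            a0, a1 = state[i], state[i | bit]
            new[i] = c * a0 - s * a1
            new[i | bit] = s * a0 + c * a1
    return new


def apply_cnot(state, control, target):
    """Flip the target qubit wherever the control qubit is 1."""
    bit_c, bit_t = 1 << control, 1 << target
    new = state[:]
    for i in range(len(state)):
        if i & bit_c and not i & bit_t:
            new[i], new[i | bit_t] = state[i | bit_t], state[i]
    return new


def expect_z(state, qubit):
    """Expectation value of Pauli-Z on one qubit."""
    bit = 1 << qubit
    return sum((-1.0 if i & bit else 1.0) * a * a for i, a in enumerate(state))


def pqc_readout(features, weights):
    """Hypothetical PQC readout (illustrative ansatz, not the paper's).

    features: one rotation angle per qubit (angle encoding).
    weights:  list of layers, each holding one trainable angle per qubit.
    Returns one Pauli-Z expectation per qubit, each in [-1, 1].
    """
    n = len(features)
    state = [0.0] * (2 ** n)
    state[0] = 1.0  # start in |00...0>
    for q, x in enumerate(features):
        state = apply_ry(state, q, x)  # encode the input features
    for layer in weights:
        for q, w in enumerate(layer):
            state = apply_ry(state, q, w)  # trainable rotations
        for q in range(n - 1):
            state = apply_cnot(state, q, q + 1)  # entangle neighbouring qubits
    return [expect_z(state, q) for q in range(n)]


def pqc_param_count(n_qubits, n_layers):
    """Trainable angles in this toy ansatz: one per qubit per layer."""
    return n_qubits * n_layers


# Example: 4 input features, 2 variational layers (8 trainable angles).
outputs = pqc_readout([0.1, 0.5, 1.0, 1.5], [[0.2] * 4, [0.3] * 4])
```

With no variational layers, each output reduces to the cosine of the corresponding input angle, which is a quick sanity check on the simulator. The parameter count here applies to this toy ansatz only and says nothing about the reduction reported in the paper.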

Paper Structure

This paper contains 16 sections, 21 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: The architecture of the model
  • Figure 2: (a) The architecture of the DRL model. (b) The Quantum Graph Attention Network framework.
  • Figure 3: Training and testing loss versus the number of epochs with $20$ customers.