GASE: Graph Attention Sampling with Edges Fusion for Solving Vehicle Routing Problems

Zhenwei Wang, Ruibin Bai, Fazlullah Khan, Ender Ozcan, Tiehua Zhang

TL;DR

This paper tackles VRP with a data-driven end-to-end approach by introducing GASE, which combines a residual graph attention sampling encoder with an adaptive multi-head attention decoder. The encoder selects top-$K$ highly related neighbors via a learned attention matrix and a filtering mechanism, producing compact node/edge representations, while the decoder generates routes under capacity constraints with masking and stochastic policy optimization. Training uses an adaptive actor–critic framework with a self-critic baseline to improve convergence and generalization, showing strong performance on randomly generated VRP instances and CVRPLIB benchmarks. The results indicate improved solution quality and faster inference relative to baselines, highlighting the method’s scalability and practical impact for real-world routing under diverse distributions, though generalization remains a key consideration for future work.
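The masking step mentioned above can be illustrated with a minimal sketch. It assumes a single vehicle, node 0 as the depot with zero demand, and a decoder that emits one logit per node; the function names `feasible_mask` and `masked_policy` are hypothetical and are not taken from the paper.

```python
import numpy as np

def feasible_mask(demands, visited, remaining_capacity, at_depot):
    """Boolean mask of nodes the vehicle may visit next (True = allowed).

    Assumption: index 0 is the depot and demands[0] == 0.
    """
    demands = np.asarray(demands, dtype=float)
    visited = np.asarray(visited, dtype=bool)
    # A customer is feasible if it is unvisited and its demand still fits.
    mask = (~visited) & (demands <= remaining_capacity)
    # The depot is allowed unless the vehicle is already there while
    # customers remain unserved (this avoids empty depot-to-depot moves).
    customers_left = not visited[1:].all()
    mask[0] = not (at_depot and customers_left)
    return mask

def masked_policy(logits, mask):
    """Softmax over the feasible nodes only; infeasible nodes get probability 0."""
    logits = np.where(mask, np.asarray(logits, dtype=float), -np.inf)
    logits = logits - logits[mask].max()  # shift for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()
```

Sampling the next node from `masked_policy` gives a stochastic policy that never violates the capacity constraint, while taking the `argmax` gives greedy decoding.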

Abstract

Learning-based methods have become increasingly popular for solving vehicle routing problems (VRPs) due to their near-optimal performance and fast inference speed. Among them, the combination of deep reinforcement learning and graph representation allows node topology and features to be abstracted in an encoder-decoder style. Such an approach makes it possible to solve routing problems end-to-end without complicated heuristic operators designed by domain experts. Existing studies have focused on novel encoding and decoding structures built from various neural network models to enhance node embeddings. Despite these sophisticated approaches, the graph-theoretic properties inherent to routing problems have received little consideration. Moreover, the effect of inter-node interactions on the models' decision-making has not been adequately explored. To bridge this gap, we propose an adaptive Graph Attention Sampling with Edges Fusion framework (GASE), where each node's embedding is determined through attention computed over certain highly correlated neighbours and edges, using a filtered adjacency matrix. In detail, the selection of particular neighbours and adjacent edges is guided by a multi-head attention mechanism, contributing directly to message passing and node embedding in graph attention sampling networks. Furthermore, we incorporate an adaptive actor-critic algorithm with policy improvements to expedite training convergence. We then conduct comprehensive experiments against baseline methods on learning-based VRP tasks from different perspectives. Our proposed model outperforms the existing methods by 2.08% to 6.23% and shows stronger generalization ability, achieving state-of-the-art performance on randomly generated instances and real-world datasets.

Paper Structure (20 sections, 24 equations, 5 figures, 4 tables, 1 algorithm)

Figures (5)

  • Figure 1: An end-to-end GASE schema pipeline
  • Figure 2: Encoder with attention sampling, using node 2 as an example. The top 2 nodes most highly related to node 2 are sampled for aggregation. In the attention matrix, each row corresponds to an aggregating node, and the entries along that row are the attention coefficients of its neighbouring nodes. The red box illustrates the operation for a single node; all nodes are processed together via matrix operations. The encoder is a residual network consisting of $L$ layers, and in each layer every node combines the features of its top-$K$ neighbouring nodes and edges. The encoder outputs the hidden embedding $H^{(L)}$, which represents the node embeddings, and the average value $Z(\mathbf{g})$, which represents the graph; both are computed after all $L$ residual layers.
  • Figure 3: Validation performance of GASE for several skip-connection layers on problem sizes 20/50 and different sampling rates $K$.
  • Figure 4: Validation performance of GASE for different skip-connection layers on problem size 20 and various sampling rates $K$.
  • Figure 5: Validation performance of GASE for different attention heads on problem size 20 and various sampling rates $K$.
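
The encoder described in the Figure 2 caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a single attention head, fuses edges as a scalar-weighted bias on the attention scores (`w_e` is a hypothetical parameter), lets a node select itself among its top $K$, and takes $Z(\mathbf{g})$ to be the mean of the final node embeddings. All function and parameter names are made up for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def top_k_filter(attn, k):
    """Keep the k largest coefficients in each row and renormalise the row.

    Rows index the aggregating nodes and columns index their neighbours,
    so `keep` plays the role of a filtered adjacency matrix.
    """
    n = attn.shape[0]
    idx = np.argsort(-attn, axis=1)[:, :k]       # top-k columns per row
    keep = np.zeros(attn.shape, dtype=bool)
    keep[np.arange(n)[:, None], idx] = True
    filtered = np.where(keep, attn, 0.0)
    return filtered / filtered.sum(axis=1, keepdims=True), keep

def encoder_layer(h, edges, w_q, w_k, w_v, w_e, k):
    """One residual layer: attention, top-k sampling, aggregation."""
    q, key, v = h @ w_q, h @ w_k, h @ w_v
    # Assumption: edge features (here pairwise distances) enter as a bias.
    scores = q @ key.T / np.sqrt(q.shape[1]) + w_e * edges
    attn = softmax(scores, axis=1)
    filtered, keep = top_k_filter(attn, k)
    return h + filtered @ v, keep                # skip connection

def encode(h0, edges, params, k):
    """Stack the layers; return node embeddings H and graph embedding Z."""
    h = h0
    for w_q, w_k, w_v, w_e in params:
        h, _ = encoder_layer(h, edges, w_q, w_k, w_v, w_e, k=k)
    return h, h.mean(axis=0)
```

With `k=2`, each row of the filtered attention matrix has exactly two non-zero entries, matching the "top 2 nodes" example in the caption; the whole computation is expressed as matrix operations over all nodes at once.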