JPDS-NN: Reinforcement Learning-Based Dynamic Task Allocation for Agricultural Vehicle Routing Optimization
Yixuan Fan, Haotian Xu, Mengqiao Liu, Qing Zhuo, Tao Zhang
TL;DR
This work tackles the Entrance Dependent Vehicle Routing Problem (EDVRP) in agriculture, where field geometry and entrance locations critically affect routing. It introduces JPDS-NN, an encoder–decoder network with graph transformers and attention that models routing as a Markov Decision Process and is trained with Proximal Policy Optimization to optimize distance, time, and fuel. Key contributions include a graph-transformer–based input encoder with a pre-training task, a GRU encoder for the generated sequence, a joint node-entrance action sampler, and comprehensive ablations plus dynamic rearrangement experiments. Empirical results show that JPDS-NN reduces travel distance by 48.4–65.4% and fuel usage by 14.0–17.6%, computes two orders of magnitude faster than baselines, and performs 15–25% better in dynamic scenarios, indicating strong practical value for scalable, intelligent agricultural routing. The ablations also show that cross-attention and pre-training are important for robust performance in complex, dynamic field environments.
Abstract
The Entrance Dependent Vehicle Routing Problem (EDVRP) is a variant of the Vehicle Routing Problem (VRP) in which the physical extent of each city influences routing outcomes, so the entrances to each city must be considered. This paper addresses EDVRP in agriculture, focusing on multi-parameter vehicle planning for irregularly shaped fields. Traditional methods, such as heuristic approaches, often overlook field geometry and entrance constraints. To address these limitations, we propose a Joint Probability Distribution Sampling Neural Network (JPDS-NN) to solve the EDVRP. The network uses an encoder-decoder architecture with graph transformers and attention mechanisms to model routing as a Markov Decision Process, and is trained via reinforcement learning for fast end-to-end planning. Experimental results indicate that JPDS-NN reduces travel distances by 48.4-65.4%, lowers fuel consumption by 14.0-17.6%, and computes two orders of magnitude faster than baseline methods, while performing 15-25% better in dynamic arrangement scenarios. Ablation studies validate the necessity of cross-attention and pre-training. The framework enables scalable, intelligent routing for large-scale farming under dynamic constraints.
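
To make the idea of sampling a field and its entrance jointly more concrete, the sketch below draws one (field, entrance) pair at a time from a single softmax over all pairs whose field is still unvisited. It is only an illustration under stated assumptions, not the JPDS-NN implementation: the function names, the hand-written `logits` table, and the masking rule are hypothetical, and in the paper the scores would come from the trained encoder-decoder network rather than a fixed list.

```python
import math
import random


def joint_action_distribution(logits, visited):
    """Softmax over every (field, entrance) pair whose field is unvisited.

    logits[i][j] is an assumed score for entering field i through its
    entrance j; fields may have different numbers of entrances.
    """
    pairs = [(i, j)
             for i, row in enumerate(logits) if i not in visited
             for j in range(len(row))]
    if not pairs:
        raise ValueError("all fields already visited")
    peak = max(logits[i][j] for i, j in pairs)  # subtract max for stability
    weights = [math.exp(logits[i][j] - peak) for i, j in pairs]
    total = sum(weights)
    return {pair: w / total for pair, w in zip(pairs, weights)}


def sample_joint_action(logits, visited, rng):
    """Draw one (field, entrance) pair and return it with its probability."""
    dist = joint_action_distribution(logits, visited)
    pairs = list(dist)
    pair = rng.choices(pairs, weights=[dist[p] for p in pairs], k=1)[0]
    return pair, dist[pair]


# Hypothetical scores: 3 fields with 2, 1 and 2 entrances respectively.
logits = [[0.5, 1.5], [0.2], [2.0, -1.0]]
rng = random.Random(0)

visited = set()
route = []
while len(visited) < len(logits):
    (field, entrance), prob = sample_joint_action(logits, visited, rng)
    route.append((field, entrance))
    visited.add(field)  # mask this field in later steps

print(route)
```

Sampling the pair from one distribution, rather than picking a field first and an entrance afterwards, lets the probability of an entrance depend on which field it belongs to within a single decision step; this is one plausible reading of "joint probability distribution sampling", stated here as an assumption.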
