Integrated trucks assignment and scheduling problem with mixed service mode docks: A Q-learning based adaptive large neighborhood search algorithm
Yueyi Li, Mehrdad Mohammadi, Xiaodong Zhang, Yunxing Lan, Willem van Jaarsveld
TL;DR
This work tackles the integrated Truck Assignment and Scheduling Problem with Dock Mode Decision (TASP-DMD) in mixed-service-mode docks, formulating a multiobjective NP-hard problem that jointly decides dock modes, truck assignments, and sequencing. It introduces a Q-learning-based Adaptive Large Neighborhood Search (Q-ALNS) that uses a two-layer loop to perturb dock modes and to optimize truck operations, with Q-learning guiding operator selection while the algorithm maintains a Pareto front. In a three-phase experimental study, the approach consistently achieves tighter optimality gaps and richer Pareto fronts than benchmark algorithms, particularly on larger instances, and its adaptive dock-mode decisions outperform pre-set strategies. The findings indicate the practical value of adaptive dock modes and learning-guided ALNS for improving tardiness, makespan, and cargo handling efficiency in unmanned distribution centers, and point to handling uncertainty and automated parameter tuning as future work.
Abstract
Mixed service mode docks enhance efficiency by flexibly handling both loading and unloading trucks in warehouses. However, existing research often predetermines the number and location of these docks before planning truck assignment and sequencing. This paper proposes a new model that integrates dock mode decision, truck assignment, and scheduling, thus enabling adaptive dock mode arrangements. Specifically, we introduce a Q-learning-based adaptive large neighborhood search (Q-ALNS) algorithm to address the integrated problem. The algorithm adjusts dock modes via perturbation operators, while truck assignment and scheduling are solved using destroy and repair local search operators. Q-learning adaptively selects these operators based on their performance history and future gains, employing an epsilon-greedy strategy. Extensive experimental results and statistical analysis indicate that Q-ALNS benefits from efficient operator combinations and its adaptive mechanism, consistently outperforming benchmark algorithms in terms of optimality gap and Pareto front discovery. Compared with predetermined service modes, our adaptive strategy yields lower average tardiness and makespan, highlighting its superior adaptability to varying demands.
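The operator-selection idea described above can be illustrated with a minimal sketch of tabular Q-learning with epsilon-greedy selection. It is not the authors' implementation: the class name OperatorSelector, the integer state encoding, the reward signal, and the default values of alpha, gamma, and epsilon are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


class OperatorSelector:
    """Hypothetical tabular Q-learning selector over (state, operator) pairs."""

    def __init__(self, n_states, n_operators, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=None):
        # One Q-value per (state, operator) pair, initialised to zero.
        self.q = [[0.0] * n_operators for _ in range(n_states)]
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount on future gains (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)

    def select(self, state):
        """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
        row = self.q[state]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(row))
        # Greedy choice; ties resolve to the lowest operator index.
        return max(range(len(row)), key=row.__getitem__)

    def update(self, state, op, reward, next_state):
        """Standard Q-learning update toward reward plus discounted best next value."""
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][op] += self.alpha * (td_target - self.q[state][op])
```

In an ALNS loop, one would call select before applying a destroy, repair, or perturbation operator, then call update with a reward derived from the resulting solution quality (for example, a larger reward when the new solution enters the Pareto front). How the paper defines states and rewards is not given in the abstract, so those choices are left open here.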
