AI2STOW: End-to-End Deep Reinforcement Learning to Construct Master Stowage Plans under Demand Uncertainty
Jaike Van Twiller, Djordje Grbic, Rune Møller Jensen
TL;DR
AI2STOW tackles master stowage planning under demand uncertainty by extending the master planning problem (MPP) to include paired block stowage (PBS) patterns and by using an end-to-end DRL policy with feasibility projection and an action mask. The architecture combines a self-attention-based encoder-decoder with PBS-aware action masking and projection layers, enabling scalable, adaptive solutions for large vessels and realistic voyage horizons. Empirical results show that AI2STOW outperforms stochastic programming baselines (SMIP-NA, SMIP-PI) and prior DRL approaches in both objective value and computational efficiency, while generalizing to longer voyages. The approach demonstrates the viability of DRL for end-to-end stowage planning under uncertainty and highlights avenues for integration with slot planning and hybrid ML-CO methods.
Abstract
The worldwide economy and environmental sustainability depend on efficient and reliable supply chains, in which container shipping plays a crucial role as an environmentally friendly mode of transport. Liner shipping companies seek to improve operational efficiency by solving the stowage planning problem. Due to many complex combinatorial aspects, stowage planning is challenging and often decomposed into two NP-hard subproblems: master and slot planning. This article proposes AI2STOW, an end-to-end deep reinforcement learning model with feasibility projection and an action mask to create master plans under demand uncertainty with global objectives and constraints, including paired block stowage patterns. Our experimental results demonstrate that AI2STOW outperforms baseline methods from reinforcement learning and stochastic programming in objective performance and computational efficiency, based on simulated instances reflecting the scale of realistic vessels and operational planning horizons.
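The two feasibility mechanisms named in the abstract, an action mask and a feasibility projection, can be illustrated with a minimal sketch. This is not the AI2STOW implementation: the function names `masked_softmax` and `project_to_capacity`, the single capacity constraint, and the uniform rescaling used as a stand-in for a projection layer are all assumptions chosen for illustration.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over allowed actions; entries with mask[i] == False get probability 0.

    Hypothetical helper: stands in for a pattern-aware action mask.
    """
    if not any(mask):
        raise ValueError("at least one action must be allowed")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(l for l, m in zip(logits, mask) if m)
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def project_to_capacity(action, capacity):
    """Repair a continuous action so it is non-negative and sums to at most capacity.

    Assumption: one aggregate capacity constraint, repaired by clipping
    negatives to zero and then rescaling uniformly. A real projection layer
    would handle many coupled constraints.
    """
    clipped = [max(0.0, a) for a in action]
    total = sum(clipped)
    if total <= capacity or total == 0.0:
        return clipped
    scale = capacity / total
    return [a * scale for a in clipped]


if __name__ == "__main__":
    # Block 1 is disallowed by the mask, so it receives zero probability.
    print(masked_softmax([1.0, 2.0, 3.0], [True, False, True]))
    # The raw action exceeds the capacity of 5.0 and is scaled back.
    print(project_to_capacity([4.0, -1.0, 6.0], 5.0))
```

The mask removes disallowed choices before sampling, while the repair step adjusts the quantities of an already chosen action, which is why the two mechanisms are complementary rather than redundant.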
