AI2STOW: End-to-End Deep Reinforcement Learning to Construct Master Stowage Plans under Demand Uncertainty

Jaike Van Twiller, Djordje Grbic, Rune Møller Jensen

TL;DR

AI2STOW tackles master stowage planning under demand uncertainty by extending the master planning problem (MPP) to include paired block stowage (PBS) patterns and by using an end-to-end deep reinforcement learning (DRL) policy with feasibility projection and an action mask. The architecture combines a self-attention-based encoder-decoder with PBS-aware action masking and projection layers, enabling scalable, adaptive solutions for large vessels and realistic voyage horizons. Empirical results show that AI2STOW outperforms stochastic programming baselines (SMIP-NA, SMIP-PI) and prior DRL approaches in both objective value and computational efficiency, while generalizing to longer voyages. The approach demonstrates the viability of DRL for end-to-end stowage planning under uncertainty and highlights avenues for integration with slot planning and hybrid ML-CO methods.

Abstract

The worldwide economy and environmental sustainability depend on efficient and reliable supply chains, in which container shipping plays a crucial role as an environmentally friendly mode of transport. Liner shipping companies seek to improve operational efficiency by solving the stowage planning problem. Due to many complex combinatorial aspects, stowage planning is challenging and often decomposed into two NP-hard subproblems: master and slot planning. This article proposes AI2STOW, an end-to-end deep reinforcement learning model with feasibility projection and an action mask to create master plans under demand uncertainty with global objectives and constraints, including paired block stowage patterns. Our experimental results demonstrate that AI2STOW outperforms baseline methods from reinforcement learning and stochastic programming in objective performance and computational efficiency, based on simulated instances reflecting the scale of realistic vessels and operational planning horizons.

Paper Structure

This paper contains 32 sections, 23 equations, 7 figures, 8 tables, and 6 algorithms.

Figures (7)

  • Figure 1: Vessel side and top view [van_twiller_literature_2024]
  • Figure 2: Vessel front view [van_twiller_literature_2024]
  • Figure 3: Hierarchical decomposition of stowage planning [pacino_fast_2011]
  • Figure 4: Deep reinforcement learning architecture with feasibility projection for actor-critic methods [van_twiller_navigating_2025]
  • Figure 5: Layers of the encoder and the actor-critic decoder [van_twiller_navigating_2025]
  • ...and 2 more figures