An Efficient Deep Reinforcement Learning Model for Online 3D Bin Packing Combining Object Rearrangement and Stable Placement
Peiwen Zhou, Ziyan Gao, Chenghao Li, Nak Young Chong
TL;DR
This paper tackles online 3D bin packing (3D-BPP), an NP-hard problem in logistics, by combining a two-agent deep reinforcement learning (DRL) framework with a highly reliable physics-based stability heuristic and an object rearrangement capability. It frames packing as an MDP $M=\langle S,A,P,R,\gamma\rangle$ with two policies, $\pi_o$ (orientation) and $\pi_p$ (placement), which operate on depth-based heightmaps to select $(o,p)$ actions. Key contributions include convexHull-based stability checks (convexHull-1 and convexHull-α), integration of a physics heuristic into PPO-based DRL, and empirical validation showing higher space utilization with fewer training epochs while ensuring placement stability. The approach is practically relevant to online packing in warehouses, where it improves space utilization and training efficiency under real-time constraints.
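To make the idea of a convex-hull stability check on a heightmap concrete, the following is a minimal sketch of a generic support-polygon test. It is not the paper's convexHull-1 or convexHull-α procedure, whose details are not given here. The function names (`convex_hull`, `strictly_inside`, `is_placement_stable`), the integer-grid heightmap layout, the `tol` parameter, and the assumption that a box has uniform density (so its center of mass projects onto the center of its footprint) are all illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """2D cross product of vectors OA and OB (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def strictly_inside(hull: Sequence[Point], q: Point) -> bool:
    """True if q lies strictly inside a counter-clockwise convex polygon."""
    n = len(hull)
    if n < 3:  # a point or a segment cannot support a box
        return False
    return all(cross(hull[i], hull[(i + 1) % n], q) > 0 for i in range(n))


def is_placement_stable(heightmap: Sequence[Sequence[float]],
                        x: int, y: int, w: int, d: int,
                        tol: float = 0.0) -> bool:
    """Hypothetical check: a w-by-d box placed with its corner at cell (x, y)
    rests on the highest cells under its footprint. The placement is treated
    as stable if the footprint center (assumed center-of-mass projection)
    lies strictly inside the convex hull of those supporting cells."""
    cells = [(i, j) for i in range(x, x + w) for j in range(y, y + d)]
    top = max(heightmap[i][j] for i, j in cells)
    support: List[Point] = []
    for i, j in cells:
        if top - heightmap[i][j] <= tol:
            # Each supporting cell contributes its four corner points.
            support += [(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)]
    center = (x + w / 2.0, y + d / 2.0)
    return strictly_inside(convex_hull(support), center)
```

Under these assumptions, a box resting on a flat floor is accepted, while a 3-by-3 box balanced on a single one-cell pillar at its corner is rejected because its footprint center falls outside the support region. A check of this kind could be used to mask out unstable candidates before the placement policy chooses among the remaining ones.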
Abstract
This paper presents an efficient deep reinforcement learning (DRL) framework for online 3D bin packing (3D-BPP). The 3D-BPP is an NP-hard problem significant in logistics, warehousing, and transportation, involving the optimal arrangement of objects inside a bin. Traditional heuristic algorithms often fail to address dynamic and physical constraints in real-time scenarios. We introduce a novel DRL framework that integrates a reliable physics heuristic algorithm with object rearrangement and stable placement. Our experiments show that the proposed framework achieves higher space utilization rates, effectively minimizing the amount of wasted space, with fewer training epochs.
