GOPT: Generalizable Online 3D Bin Packing via Transformer-based Deep Reinforcement Learning
Heng Xiong, Changrong Guo, Jian Peng, Kai Ding, Wenjie Chen, Xuchong Qiu, Long Bai, Jianfeng Xu
TL;DR
GOPT tackles online 3D bin packing across bins of varying size. It formulates packing as a Markov decision process (MDP) and combines two components: a Placement Generator, which yields a fixed-size set of placement candidates, and a Packing Transformer, which learns spatial relations between items and candidate spaces. The Placement Generator constrains the action space to 2N options regardless of bin size, and the Packing Transformer uses cross-attention to fuse item and space features; together they enable robust generalization. Trained with PPO, GOPT achieves higher space utilization and packs more items than the baselines, and it generalizes to unseen bin dimensions and items, as demonstrated both in simulation and on a real robotic system. The result is a scalable, generalizable packing strategy applicable to diverse logistics scenarios; future work targets irregular shapes and improved real-world reliability.
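To make the fixed-size action space concrete, here is a minimal Python sketch. It is not the authors' implementation: the function name `build_actions`, the tuple layout of a subspace, and the reading of 2N as N candidate subspaces times two horizontal orientations are all assumptions made for illustration. The candidate subspaces are taken as given; how the Placement Generator actually computes them is not covered here.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical layout: (x, y, z, width, length, height) of a free subspace.
Subspace = Tuple[int, int, int, int, int, int]
# A placement: corner position plus the oriented item size, or None if invalid.
Placement = Optional[Tuple[int, int, int, int, int, int]]


def build_actions(
    item: Tuple[int, int, int],
    subspaces: Sequence[Subspace],
    n_slots: int,
) -> Tuple[List[Placement], List[bool]]:
    """Return (actions, mask), each of length 2 * n_slots.

    The length depends only on n_slots, never on the bin dimensions.
    """
    w, l, h = item
    # Assumption: two axis-aligned orientations (item rotated 0 or 90 degrees).
    orientations = [(w, l, h), (l, w, h)]

    # Truncate to n_slots candidates, then pad with None up to n_slots.
    kept = list(subspaces[:n_slots])
    slots: List[Optional[Subspace]] = kept + [None] * (n_slots - len(kept))

    actions: List[Placement] = []
    mask: List[bool] = []
    for ow, ol, oh in orientations:
        for s in slots:
            # An action is valid only if the oriented item fits in the subspace.
            fits = s is not None and ow <= s[3] and ol <= s[4] and oh <= s[5]
            actions.append((s[0], s[1], s[2], ow, ol, oh) if fits else None)
            mask.append(fits)
    return actions, mask


actions, mask = build_actions(
    item=(2, 3, 1),
    subspaces=[(0, 0, 0, 3, 2, 5), (0, 0, 0, 10, 10, 10)],
    n_slots=4,
)
print(len(actions), mask)
```

With `n_slots=4` the policy always chooses among eight slots, and the mask marks which of them correspond to feasible placements; a larger bin changes the subspace coordinates but not the number of slots.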
Abstract
Robotic object packing has broad practical applications in the logistics and automation industry, and is often formulated by researchers as the online 3D Bin Packing Problem (3D-BPP). However, existing methods based on deep reinforcement learning (DRL) primarily focus on enhancing performance in limited packing environments while neglecting the ability to generalize across multiple environments characterized by different bin dimensions. To this end, we propose GOPT, a generalizable online 3D Bin Packing approach via Transformer-based DRL. First, we design a Placement Generator module to yield finite subspaces as placement candidates and the representation of the bin. Second, we propose a Packing Transformer, which fuses the features of the items and bin, to identify the spatial correlation between the item to be packed and the available subspaces within the bin. Coupling these two components enables GOPT to perform inference on bins of varying dimensions. We conduct extensive experiments and demonstrate that GOPT not only achieves superior performance against the baselines, but also exhibits excellent generalization capabilities. Furthermore, the deployment with a robot showcases the practical applicability of our method in the real world. The source code will be publicly available at https://github.com/Xiong5Heng/GOPT.
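The feature fusion described above can be illustrated with a generic single-head scaled dot-product cross-attention, in which item features act as queries and subspace features act as keys and values. This is only a sketch of the general mechanism, not the Packing Transformer itself: the function names are hypothetical, the learned projection matrices are omitted (identity projections are assumed), and the feature vectors are made up.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(
    queries: List[Vector], keys: List[Vector], values: List[Vector]
) -> List[Vector]:
    """Single-head scaled dot-product cross-attention.

    Assumption: no learned projections; queries, keys, and values are used as-is.
    Returns one fused vector per query, whatever the number of keys.
    """
    d = len(keys[0])
    fused: List[Vector] = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return fused


# Made-up features: one item attends over three candidate subspaces.
item_feats = [[1.0, 0.0]]
space_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(cross_attention(item_feats, space_feats, space_feats))
```

Because the output has one vector per query regardless of how many keys are supplied, the same attention layer can process any number of candidate subspaces, which is the property that makes attention a natural fit for variable bin layouts.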
