Pallet Detection And Localisation From Synthetic Data

Henri Mueller, Yechan Kim, Trevor Gee, Mahla Nejati

TL;DR

The paper tackles efficient pallet detection and localisation using purely synthetic data generated with domain randomisation in Unity, aiming to eliminate manual image annotation. It combines YOLOv8-based detection with a corner-detection and PnP pose-estimation pipeline to recover 3D pallet pose from monocular RGB imagery. On real-world data the authors report an mAP50 of 0.995 for single pallets and, for head-on pallets within 5 m, an average position error below 4.2 cm and an average rotation error of 8.2°. The work highlights the viability of synthetic data for bridging the reality gap in warehouse robotics and outlines concrete directions for extending the method to multiple pallets and real-world deployments.

Abstract

The global warehousing industry is experiencing rapid growth, with the market size projected to grow at an annual rate of 8.1% from 2024 to 2030 [Grand View Research, 2021]. This expansion has led to a surge in demand for efficient pallet detection and localisation systems. While automation can significantly streamline warehouse operations, developing such systems often requires extensive manual data annotation, which takes an average of 35 seconds per image in a typical computer vision project. This paper presents a novel approach to pallet detection and localisation that uses purely synthetic data and geometric features derived from the pallets' side faces. A domain randomisation engine implemented in Unity eliminates the need for time-consuming manual annotation while still achieving high performance. The proposed method achieves a pallet detection performance of 0.995 mAP50 for single pallets on a real-world dataset. Additionally, an average position error of less than 4.2 cm and an average rotation error of 8.2° were achieved for pallets within a 5-metre range, with the pallet positioned head-on.
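To make the reported metrics concrete, the following is a minimal sketch of how they could be computed. It rests on assumptions not confirmed by this summary: boxes are axis-aligned `(x_min, y_min, x_max, y_max)` tuples, position error is the Euclidean distance between estimated and true pallet positions, and rotation error is the absolute difference of a single yaw angle. The function names are hypothetical and the paper's own evaluation code may differ. The only standard fact used is that mAP50 counts a detection as correct when its intersection over union (IoU) with the ground-truth box is at least 0.5.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A detection matches the ground truth at the mAP50 threshold (IoU >= 0.5)."""
    return iou(pred_box, gt_box) >= threshold


def position_error_cm(est_xyz_m, true_xyz_m):
    """Assumed metric: Euclidean distance between positions, metres to centimetres."""
    return 100.0 * math.dist(est_xyz_m, true_xyz_m)


def rotation_error_deg(est_yaw_deg, true_yaw_deg):
    """Assumed metric: absolute yaw difference, wrapped into [0, 180] degrees."""
    diff = (est_yaw_deg - true_yaw_deg + 180.0) % 360.0 - 180.0
    return abs(diff)
```

For example, `is_true_positive((0, 0, 10, 10), (0, 0, 10, 5))` sits exactly on the 0.5 threshold and counts as a match, while `rotation_error_deg(355.0, 5.0)` wraps around to 10 degrees rather than 350.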

Paper Structure

This paper contains 25 sections, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Overview of the localisation system.
  • Figure 2: Visualisation of a localised pallet using the proposed system.
  • Figure 3: Domain randomisation scene.
  • Figure 4: Localisation error visualisation.
  • Figure 5: Comparative position error.
  • ...and 4 more figures