Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding

Yifan Tang, Cong Tai, Fangxing Chen, Wanting Zhang, Tao Zhang, Xueping Liu, Yongjin Liu, Long Zeng

TL;DR

This work addresses the lack of dynamic content in mobile-robot benchmarks by introducing THUD, a large-scale indoor dataset comprising real and synthetic data across 13 dynamic scenarios. The authors detail acquisition pipelines using a PUDUbot2 robot equipped with a Kinect V2 and a Unity3D simulation platform, along with dual annotation strategies that yield over 90K frames and more than 20 million labels for 91 object categories. They demonstrate that dynamic objects substantially challenge common perception tasks (3D object detection, semantic segmentation, and robot relocalization), with performance degrading as dynamic density increases. THUD aims to accelerate the development of mobile-robot perception in real dynamic environments and is being continuously expanded to support additional tasks such as navigation, tracking, and trajectory prediction, with the goal of bridging simulation and real-world robot deployment.

Abstract

Most existing robotic datasets capture static scene data and are therefore of limited use for evaluating robots' performance in dynamic scenes. To address this, we present a mobile-robot-oriented large-scale indoor dataset, the THUD (Tsinghua University Dynamic) robotic dataset, for training and evaluating dynamic scene understanding algorithms. We first detail the construction of THUD, including its organization, acquisition, and annotation methods. It comprises both real-world and synthetic data, collected with a real robot platform and a physical simulation platform, respectively. The current dataset includes 13 large-scale dynamic scenarios, 90K image frames, 20M 2D/3D bounding boxes of static and dynamic objects, camera poses, and IMU data, and it is still expanding. We then evaluate mainstream indoor scene understanding tasks, e.g. 3D object detection, semantic segmentation, and robot relocalization, on THUD. These experiments reveal serious challenges for some robot scene understanding tasks in dynamic scenes. By sharing this dataset, we aim to foster rapid development and iteration of new mobile robot algorithms for robots' actual working environments, i.e. complex, crowded dynamic scenes.

Paper Structure

This paper contains 16 sections, 8 figures, and 3 tables.

Figures (8)

  • Figure 1: PUDUbot2 & Kinect V2 joint collection platform
  • Figure 2: Scenes with varying levels of dynamic complexity
  • Figure 3: Mobile robot synthetic data acquisition platform
  • Figure 4: Dynamic and special scenarios
  • Figure 5: Statistics of annotations in our dataset
  • ...and 3 more figures