THUD++: Large-Scale Dynamic Indoor Scene Dataset and Benchmark for Mobile Robots

Zeshun Li, Fuhao Li, Wanting Zhang, Zijie Zheng, Xueping Liu, Yongjin Liu, Long Zeng

TL;DR

THUD++ introduces a large-scale dynamic indoor dataset for mobile robots, merging real-world and Unity3D-simulated data to enable robust evaluation of perception, prediction, and navigation in crowded environments. It provides an RGB-D dataset with over 90K frames and 20M 2D/3D bounding boxes, a trajectory dataset covering over 6,000 pedestrian trajectories, and a Unity-based simulation platform with a robot navigation emulator. Extensive benchmarks across 3D object detection, semantic segmentation, relocalization, trajectory prediction, and navigation highlight the performance degradation of state-of-the-art methods in dynamic indoor scenes and demonstrate the dataset’s value for developing robust, socially aware robotic systems. THUD++ aims to accelerate research and practical deployment by offering rich annotations, realistic dynamics, and a controllable testing environment. The work emphasizes the need for dynamic-scene understanding to advance real-world mobile robotics applications.

Abstract

Most existing mobile robotic datasets primarily capture static scenes, limiting their utility for evaluating robotic performance in dynamic environments. To address this, we present a large-scale indoor dataset oriented toward mobile robots, the THUD++ (TsingHua University Dynamic) robotic dataset, for dynamic scene understanding. Our current dataset includes 13 large-scale dynamic scenarios, combining real-world and synthetic data collected with a real robot platform and a physical simulation platform, respectively. The RGB-D dataset comprises over 90K image frames, 20M 2D/3D bounding boxes of static and dynamic objects, camera poses, and IMU data. The trajectory dataset covers over 6,000 pedestrian trajectories in indoor scenes. Additionally, the dataset is augmented with a Unity3D-based simulation platform, allowing researchers to create custom scenes and test algorithms in a controlled environment. We evaluate state-of-the-art methods on THUD++ across mainstream indoor scene understanding tasks, e.g., 3D object detection, semantic segmentation, relocalization, pedestrian trajectory prediction, and navigation. Our experiments highlight the challenges mobile robots encounter in indoor environments, especially when navigating in complex, crowded, and dynamic scenes. By sharing this dataset, we aim to accelerate the development and testing of mobile robot algorithms, contributing to real-world robotic applications.

Paper Structure

This paper contains 22 sections, 10 figures, 6 tables.

Figures (10)

  • Figure 1: PUDUbot2&Kinect V2 integrated collection platform
  • Figure 2: Unity3D-based simulation platform
  • Figure 3: Scenes with varying levels of dynamic complexity
  • Figure 4: Dynamic and special simulation scenarios
  • Figure 5: Statistics of annotations in our dataset
  • ...and 5 more figures