FROG: A new people detection dataset for knee-high 2D range finders
Fernando Amodeo, Noé Pérez-Higueras, Luis Merino, Fernando Caballero
TL;DR
FROG addresses the challenge of detecting humans with knee-high 2D LiDAR by releasing a fully annotated, diverse 2D LiDAR dataset collected in a public space, along with a fast end-to-end detector and a benchmarking suite. It introduces two deep networks, Laser Feature Extractor (LFE) and People Proposal Network (PPN), that operate directly on raw scans, enabling high-speed ROS inference and reducing reliance on hand-crafted preprocessing. The paper benchmarks several detectors (DROW3, DR-SPAAM, PeTra) against the proposed methods, showing competitive accuracy with notably faster inference for LFE/PPN, and discusses annotation tooling, data formats, and evaluation methodology to standardize 2D LiDAR-based people detection research. Overall, FROG advances practical human detection for mobile robots using 2D LiDAR and provides a reusable framework for future improvements and extensions, including self-supervised learning and sensor fusion approaches.
Abstract
Mobile robots require knowledge of their environment, especially of the humans in their vicinity. While the most common approaches to detecting humans rely on computer vision, an often overlooked hardware feature of robots for people detection is their 2D range finders. These were originally intended for obstacle avoidance and mapping/SLAM tasks. On most robots, they are conveniently mounted at a height roughly between the ankle and the knee, so they can also be used to detect people, with a larger field of view and better depth resolution than cameras. In this paper, we present FROG, a new dataset for people detection using knee-high 2D range finders. This dataset offers higher laser resolution, a higher scanning frequency, and more complete annotations than existing datasets such as DROW. In particular, the FROG dataset contains annotations for 100% of its laser scans (unlike DROW, which annotates only 5%), 17x more annotated scans, and 100x more people annotations, and it covers over twice the distance traveled by the robot. We propose a benchmark based on the FROG dataset and analyze a collection of state-of-the-art people detectors based on 2D range finder data. We also propose and evaluate a new end-to-end deep learning approach for people detection. Our solution works directly with the raw sensor data, without hand-crafted input features, thus avoiding CPU preprocessing and freeing the developer from having to understand domain-specific heuristics. Experimental results show that the proposed people detector attains results comparable to the state of the art, while an optimized implementation for ROS can operate at more than 500 Hz.
