PMMA: The Polytechnique Montreal Mobility Aids Dataset

Qingwu Liu, Nicolas Saunier, Guillaume-Alexandre Bilodeau

TL;DR

PMMA is a new outdoor mobility aids dataset for fine-grained pedestrian detection and tracking, consisting of over 28k annotated images across nine pedestrian categories. The dataset was captured with a ZED 2 stereo camera from two viewpoints and annotated in COCO format, including occlusion and shadow labels. Seven detectors and three trackers were benchmarked under the MMDetection framework: YOLOX, Deformable DETR, and Faster R-CNN provide the strongest detection performance, while differences between trackers are comparatively small and depend heavily on detector quality. The work contributes a publicly available benchmark and codebase to advance inclusive, safety-focused vision systems for mobility-impaired pedestrians in real-world outdoor environments.

Abstract

This study introduces PMMA, a new object detection dataset of pedestrians using mobility aids. The dataset was collected in an outdoor environment, where volunteers used wheelchairs, canes, and walkers, resulting in nine categories: pedestrians; cane users; two types of walker users (walking or resting); and five wheelchair-related categories, namely wheelchair users, people pushing empty wheelchairs, and three categories for occupied wheelchairs being pushed (the entire pushing group, the pusher, and the person seated in the wheelchair). To establish a benchmark, seven object detection models (Faster R-CNN, CenterNet, YOLOX, DETR, Deformable DETR, DINO, and RT-DETR) and three tracking algorithms (ByteTrack, BoT-SORT, and OC-SORT) were implemented under the MMDetection framework. Experimental results show that YOLOX, Deformable DETR, and Faster R-CNN achieve the best detection performance, while the differences among the three trackers are relatively small. The PMMA dataset is publicly available at https://doi.org/10.5683/SP3/XJPQUG, and the video processing and model training code is available at https://github.com/DatasetPMMA/PMMA.
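
Because the annotations are distributed in COCO format, per-category statistics can be read directly from the annotation file. The following is a minimal sketch, assuming the standard COCO layout (`images`, `categories`, `annotations`). The helper names, the file path argument, and the category names in the toy example are hypothetical placeholders, not the labels or file names used in PMMA.

```python
import json
from collections import Counter


def load_coco(path: str) -> dict:
    """Load a COCO-style annotation file (the path is supplied by the caller)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def count_annotations_per_category(coco: dict) -> dict:
    """Return {category name: number of annotations} for a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(ann["category_id"] for ann in coco["annotations"])
    # Report categories with zero annotations too, so every class appears.
    return {name: counts.get(cat_id, 0) for cat_id, name in id_to_name.items()}


# Tiny in-memory example; category names are placeholders, not PMMA's labels.
example = {
    "images": [{"id": 1, "file_name": "frame_000001.jpg"}],
    "categories": [
        {"id": 1, "name": "pedestrian"},
        {"id": 2, "name": "cane_user"},
        {"id": 3, "name": "wheelchair_user"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [5, 5, 20, 40]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 5, 20, 40]},
        {"id": 12, "image_id": 1, "category_id": 3, "bbox": [90, 10, 30, 35]},
    ],
}

category_counts = count_annotations_per_category(example)
print(category_counts)
```

For a real annotation file, `count_annotations_per_category(load_coco(path))` would give the same kind of summary for whichever file `path` points to.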

Paper Structure (23 sections, 2 equations, 5 figures, 16 tables)

Figures (5)

  • Figure 1: Example frames of our Mobility Aids Dataset. There are nine categories, with detailed descriptions in the pedestrian categories table (tab:pedestrian_categories) and icons adapted from [icons_1, icons_2]
  • Figure 2: Data collection area in the parking lot of Polytechnique Montréal
  • Figure 3: Illustration of the camera and the pole
  • Figure 4: Category-wise annotation counts for each video
  • Figure 5: Row-normalized confusion matrices of all detection methods on the test set. Each row and column represent the ground truth (GT) and predicted classes, respectively
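
As a brief illustration of what "row-normalized" means in the Figure 5 caption, the sketch below divides each ground-truth row of a confusion matrix by its row sum, so each entry becomes the fraction of that ground-truth class assigned to each predicted class. The counts are invented for illustration and are not results from the paper; treating an empty row as all zeros is an assumption of this sketch.

```python
def row_normalize(matrix):
    """Divide each row by its sum so that non-empty rows sum to 1.

    Rows are ground-truth classes, columns are predicted classes.
    A row with no samples is left as all zeros (an assumption of this sketch).
    """
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


# Made-up counts for three classes (not taken from the paper).
counts = [
    [8, 2, 0],  # ground truth class 0
    [1, 3, 0],  # ground truth class 1
    [0, 0, 0],  # ground truth class 2 has no samples
]

normalized = row_normalize(counts)
for row in normalized:
    print([round(value, 2) for value in row])
```

With this normalization, the diagonal entry of each row can be read as the share of that class's ground-truth instances that were predicted correctly, which makes classes with very different annotation counts comparable.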