NYC-Indoor-VPR: A Long-Term Indoor Visual Place Recognition Dataset with Semi-Automatic Annotation
Diwei Sheng, Anbang Yang, John-Ross Rizzo, Chen Feng
TL;DR
This work tackles long-term indoor visual place recognition by presenting NYC-Indoor-VPR, a year-long dataset with over 36k images across 13 crowded indoor scenes in New York City. It introduces a semi-automatic ground-truth annotation pipeline that derives topometric frame locations from paired video trajectories, enabling precise VPR benchmarking without full 3D reconstructions. Benchmark experiments across six state-of-the-art VPR methods reveal substantial challenges posed by indoor dynamics, perceptual aliasing, and occlusions, highlighting the dataset's value for driving advances in indoor VPR. The dataset and annotation tools are publicly available to support future research and method development in indoor localization and navigation.
Abstract
Visual Place Recognition (VPR) in indoor environments helps both humans and robots localize and navigate. It is challenging because appearance changes occur at various frequencies and because ground-truth metric trajectories for training and evaluation are difficult to obtain. This paper introduces the NYC-Indoor-VPR dataset, a unique and rich collection of over 36,000 images compiled from 13 distinct crowded scenes in New York City, taken under varying lighting conditions with appearance changes. Each scene has multiple revisits across a year. To establish the ground truth for VPR, we propose a semi-automatic annotation approach that computes the positional information of each image. Specifically, our method takes pairs of videos as input and yields matched pairs of images along with their estimated relative locations. Human annotators then refine the accuracy of this matching, using our annotation software to correlate the selected keyframes. Finally, we present a benchmark evaluation of several state-of-the-art VPR algorithms on our annotated dataset, revealing how challenging the dataset is and thus its value for VPR research.
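To make concrete how per-image positions can serve as ground truth for a VPR benchmark, the following is a minimal sketch of a Recall@K computation. It is an illustration only, not the evaluation code used in the paper: the abstract does not name the metric, so Recall@K, the function name `recall_at_k`, the Euclidean descriptor distance, the 2D positions, and the `dist_thresh` value are all assumptions chosen for the example.

```python
import math

def recall_at_k(query_desc, db_desc, query_pos, db_pos, k=1, dist_thresh=1.0):
    """Fraction of queries with at least one correct match in the top-k retrievals.

    Hypothetical helper for illustration. A retrieved database image counts
    as correct when its annotated position lies within `dist_thresh` of the
    query's annotated position (same units as the positions).
    """
    hits = 0
    for q_vec, q_xy in zip(query_desc, query_pos):
        # Rank database images by descriptor distance to the query (closest first).
        ranked = sorted(range(len(db_desc)),
                        key=lambda i: math.dist(q_vec, db_desc[i]))
        # The query is a hit if any of the top-k candidates is spatially close.
        if any(math.dist(q_xy, db_pos[i]) <= dist_thresh for i in ranked[:k]):
            hits += 1
    return hits / len(query_desc)

# Toy data (made up): three database images and two queries.
db_desc = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
db_pos = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
query_desc = [[0.1, 0.0], [0.9, 0.1]]
query_pos = [(0.5, 0.0), (10.0, 0.0)]

r1 = recall_at_k(query_desc, db_desc, query_pos, db_pos, k=1)
r3 = recall_at_k(query_desc, db_desc, query_pos, db_pos, k=3)
print(r1, r3)
```

In this toy case the second query looks most like a database image taken at a different location, which is the kind of perceptual aliasing that a position-based ground truth is meant to expose: the match is only counted once the spatially correct image enters the top-k list.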
