MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking
Laura Leal-Taixé, Anton Milan, Ian Reid, Stefan Roth, Konrad Schindler
TL;DR
MOTChallenge tackles the lack of standardized evaluation for multi-target tracking by providing a public dataset, centralized scoring, and a crowdsourcing pathway for data and methods. It combines 2D and limited 3D MOT sequences, a uniform data/detection format, and a set of baseline trackers to enable fair comparisons and robust benchmarking. The paper also discusses annotation variability, evaluation metrics (CLEAR and track-quality measures), and practical considerations like runtime and ranking, aiming to advance generalizable tracking methods. This framework enables ongoing, transparent progress in multi-target tracking and invites community contributions across domains and modalities.
Abstract
In the recent past, the computer vision community has developed centralized benchmarks for the performance evaluation of a variety of tasks, including generic object and pedestrian detection, 3D reconstruction, optical flow, single-object short-term tracking, and stereo estimation. Despite the potential pitfalls of such benchmarks, they have proved extremely helpful in advancing the state of the art in their respective areas. Interestingly, there has been rather limited work on the standardization of quantitative benchmarks for multiple target tracking. One of the few exceptions is the well-known PETS dataset, targeted primarily at surveillance applications. Despite being widely used, it is often applied inconsistently, for example by using different subsets of the available data, different ways of training the models, or differing evaluation scripts. This paper describes our work toward a novel multiple object tracking benchmark aimed at addressing such issues. We discuss the challenges of creating such a framework, collecting existing and new data, gathering state-of-the-art methods to be tested on the datasets, and finally creating a unified evaluation system. With MOTChallenge we aim to pave the way toward a unified evaluation framework for a more meaningful quantification of multi-target tracking.
