PlanarTrack: A high-quality and challenging benchmark for large-scale planar object tracking

Yifan Jiao, Xinran Liu, Xiaoqiong Liu, Xiaohui Yuan, Heng Fan, Libo Zhang

TL;DR

PlanarTrack tackles the lack of large-scale, high-quality data for planar object tracking by introducing a dedicated benchmark with 1,150 unconstrained sequences and over 733K frames, each frame annotated with four corner points. It emphasizes long-term evaluation through a 150-sequence subset and adds PlanarTrackBB to assess how well generic trackers handle planar-like targets, uncovering substantial performance gaps. Through evaluations of 10 trackers and retraining experiments, PlanarTrack demonstrates that current methods struggle on diverse, real-world data and that scaling training data yields measurable gains. The dataset is poised to accelerate progress in robust planar tracking for AR and robotics and supports future directions like temporal modeling, multi-modal cues, and re-detection.

Abstract

Planar tracking has drawn increasing interest owing to its key roles in robotics and augmented reality. Despite great recent advances, further development of planar tracking, particularly in the deep learning era, is largely limited compared to generic tracking due to the lack of large-scale platforms. To mitigate this, we propose PlanarTrack, a large-scale, high-quality, and challenging benchmark for planar tracking. Specifically, PlanarTrack consists of 1,150 sequences with over 733K frames, including 1,000 short-term and 150 new long-term videos, which enables comprehensive evaluation of short- and long-term tracking performance. All videos in PlanarTrack are recorded in unconstrained conditions in the wild, which makes PlanarTrack challenging but more realistic for real-world applications. To ensure high-quality annotations, each video frame is manually annotated with four corner points, followed by multiple rounds of meticulous inspection and refinement. To enhance the target diversity of PlanarTrack, we capture each unique target in only one sequence, which differs from existing benchmarks. To the best of our knowledge, PlanarTrack is by far the largest, most diverse, and most challenging dataset dedicated to planar tracking. To understand the performance of existing methods on PlanarTrack and to provide a comparison for future research, we evaluate 10 representative planar trackers with extensive comparison and in-depth analysis. Our evaluation reveals that, unsurprisingly, the top planar trackers degrade heavily on the challenging PlanarTrack, which indicates that more effort is required to improve planar tracking. Our data and results will be released at https://github.com/HengLan/PlanarTrack
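
To make the four-corner-point representation concrete, the following minimal sketch shows how a planar target's corners could be propagated from an initial frame with a 3x3 homography, the 2D transformation commonly used to model planar motion. It is an illustration only: the function name `warp_corners`, the corner ordering, and the example values are assumptions, not the PlanarTrack annotation format or any evaluated tracker's method.

```python
import numpy as np

def warp_corners(H, corners):
    """Map (N, 2) corner points through a 3x3 homography H.

    Hypothetical helper for illustration; not part of the PlanarTrack release.
    """
    corners = np.asarray(corners, dtype=float)
    H = np.asarray(H, dtype=float)
    # Lift to homogeneous coordinates: (x, y) -> (x, y, 1).
    ones = np.ones((corners.shape[0], 1))
    homogeneous = np.hstack([corners, ones])
    # Apply the transformation to every point at once.
    mapped = homogeneous @ H.T
    # Divide by the last coordinate to return to pixel coordinates.
    return mapped[:, :2] / mapped[:, 2:3]

# Assumed example: four corners of the target in the initial frame,
# listed clockwise from the top-left (the ordering is an assumption).
init_corners = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 50.0], [0.0, 50.0]])

# A pure translation by (10, 20) pixels written as a homography.
H_translate = np.array([[1.0, 0.0, 10.0],
                        [0.0, 1.0, 20.0],
                        [0.0, 0.0, 1.0]])

tracked = warp_corners(H_translate, init_corners)
print(tracked)
```

Because a homography is defined only up to scale, multiplying `H_translate` by any nonzero constant yields the same corner positions, which is why the division by the last homogeneous coordinate is needed.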

Paper Structure

This paper contains 23 sections, 1 equation, 15 figures, 7 tables.

Figures (15)

  • Figure 1: Comparison between generic object tracking (a) and planar object tracking (b). The former estimates axis-aligned rectangular bounding boxes for the target object, while the latter (our focus in this work) calculates 2D transformations of the target object to obtain the corresponding corner points for localization. All figures throughout this paper are best viewed in color and by zooming in.
  • Figure 2: Summary of planar object tracking datasets, containing POT-280 [liang2021planar], POT-210 [liang2018planar], TMT [roy2015tracking], UCSB [gauglitz2011evaluation], Metaio [lieberknecht2009dataset], POIC [chen2017illumination], our PlanarTrack, and PlanarTrack$^{*}$ from the conference version [liu2023planartrack]. The circle diameter is proportional to the number of frames in a dataset. Our PlanarTrack is the largest benchmark.
  • Figure 3: Distribution of classes and scenarios in all sequences. (a): Planar targets can be divided into 21 classes. Four representative classes are highlighted. (b): Videos are all collected in these 19 scenarios.
  • Figure 4: Examples of annotated sequences in the proposed PlanarTrack. Each video is annotated with four corner points.
  • Figure 5: Statistics of planar target motion, size, area relative to the initial object, and IoU of targets in adjacent frames in PlanarTrack, compared with the recent POT-210/280 [liang2018planar, liang2021planar]. The targets in our dataset are smaller and exhibit faster, more challenging motion.
  • ...and 10 more figures